About This Book



The primary objective of this user’s manual is to define the functionality of the PowerPC 
603™ and PowerPC 603e™ microprocessors for use by software and hardware developers. 
Although the emphasis of this manual is upon the 603e, all of the information within 
applies to both the 603 and 603e, except for those differences noted in Appendix C, 
“PowerPC 603 Processor System Design and Programming Considerations.” Those 
readers who are primarily interested in the 603 should begin with Appendix C. 

The 603e is built upon the low-power dissipation, low-cost and high-performance attributes 
of the 603 while providing the system designer additional capabilities through higher 
processor clock speeds (to 100 MHz), increases in cache size (16-Kbyte instruction and 
data caches) and set-associativity (4-way), and greater system clock flexibility. The 603e
implements only the 32-bit portion of the PowerPC Architecture™.

It is important to note that this book is intended as a companion to the PowerPC
Microprocessor Family: The Programming Environments, referred to as The Programming
Environments Manual; contact your local sales representative to obtain a copy. Because the
PowerPC architecture is designed to be flexible to support a broad range of processors, The
Programming Environments Manual provides a general description of features that are
common to PowerPC processors and indicates those features that are optional or that may
be implemented differently in the design of each processor.

The PowerPC 603e RISC Microprocessor User's Manual summarizes features of the 603e
that are not defined by the architecture. This document and The Programming 
Environments Manual distinguish between the three levels, or programming environments, 
of the PowerPC architecture, which are as follows: 

• PowerPC user instruction set architecture (UISA) — The UISA defines the level of 
the architecture to which user-level software should conform. The UISA defines the 
base user-level instruction set, user-level registers, data types, memory conventions, 
and the memory and programming models seen by application programmers. 

• PowerPC virtual environment architecture (VEA) — The VEA, which is the smallest
component of the PowerPC architecture, defines additional user-level functionality
that falls outside typical user-level software requirements. The VEA describes the
memory model for an environment in which multiple processors or other devices
can access external memory, and defines aspects of the cache model and cache control
instructions from a user-level perspective. The resources defined by the VEA are
particularly useful for optimizing memory accesses and for managing resources in
an environment in which other processors and other devices can access external
memory.

• PowerPC operating environment architecture (OEA) — The OEA defines
supervisor-level resources typically required by an operating system. The OEA
defines the PowerPC memory management model, supervisor-level registers, and
the exception model.

Implementations that conform to the PowerPC OEA also conform to the PowerPC 
UISA and VEA.

It is important to note that some resources are defined more generally at one level in the 
architecture and more specifically at another. For example, conditions that cause a floating- 
point exception are defined by the UISA, while the exception mechanism itself is defined 
by the OEA. 

Because it is important to distinguish between the levels of the architecture in order to 
ensure compatibility across multiple platforms, those distinctions are shown clearly 
throughout this book. 

For ease in reference, this book has arranged topics described by the architecture into topics 
that build upon one another, beginning with a description and complete summary of 603e- 
specific registers and progressing to more specialized topics such as 603e-specific details 
regarding the cache, exception, and memory management models. As such, chapters may 
include information from multiple levels of the architecture. (For example, the discussion 
of the cache model uses information from both the VEA and the OEA.) 

The PowerPC Architecture: A Specification for a New Family of RISC Processors defines 
the architecture from the perspective of the three programming environments and remains 
the defining document for the PowerPC architecture. 

The information in this book is subject to change without notice, as described in the 
disclaimers on the title page of this book. As with any technical documentation, it is the 
readers’ responsibility to be sure they are using the most recent version of the 
documentation. For more information, contact your sales representative. 



Audience 

This manual is intended for system software and hardware developers and applications 
programmers who want to develop products for the 603e. It is assumed that the reader 
understands operating systems, microprocessor system design, the basic principles of RISC 
processing, and details of the PowerPC architecture.

Organization

Following is a summary and a brief description of the major sections of this manual: 

• Chapter 1, “Overview,” is useful for readers who want a general understanding of 
the features and functions of the PowerPC architecture and the 603e. This chapter 
describes the flexible nature of the PowerPC architecture definition, and provides an 
overview of how the PowerPC architecture defines the register set, operand 
conventions, addressing modes, instruction set, cache model, exception model, and 
memory management model. 

• Chapter 2, “PowerPC 603e Microprocessor Programming Model,” provides a brief 
synopsis of the registers implemented in the 603e, operand conventions, an 
overview of the PowerPC addressing modes, and a list of the instructions 
implemented by the 603e. Instructions are organized by function. 

• Chapter 3, “Instruction and Data Cache Operation,” provides a discussion of the 
cache and memory model as implemented on the 603e. 

• Chapter 4, “Exceptions,” describes the exception model defined in the PowerPC 
OEA and the specific exception model implemented on the 603e. 

• Chapter 5, “Memory Management,” describes the 603e’s implementation of the 
memory management unit specifications provided by the PowerPC OEA for 
PowerPC processors. 

• Chapter 6, “Instruction Timing,” provides information about latencies, interlocks, 
special situations, and various conditions to help make programming more efficient. 
This chapter is of special interest to software engineers and system designers. 

• Chapter 7, “Signal Descriptions,” provides descriptions of individual signals of the 
603e. 

• Chapter 8, “System Interface Operation,” describes signal timings for various 
operations. It also provides information for interfacing to the 603e. 

• Chapter 9, “Power Management,” provides information about power saving modes 
for the 603e. 

• Appendix A, “PowerPC Instruction Set Listings,” lists all the PowerPC instructions 
while indicating those instructions that are not implemented by the 603e; it also 
includes the instructions that are specific to the 603e. Instructions are grouped 
according to mnemonic, opcode, function, and form. Also included is a quick 
reference table that contains general information, such as the architecture level, 
privilege level, and form, and indicates if the instruction is 64-bit and optional. 

• Appendix B, “Instructions Not Implemented,” provides a list of PowerPC 
instructions not implemented on the 603e. 

• Appendix C, “PowerPC 603 Processor System Design and Programming 
Considerations,” provides a discussion of the hardware and software differences 
between the 603 and 603e. 

• This manual also includes a glossary and an index.

In this document, the term “603e” is used as an abbreviation for the phrase, “PowerPC
603e microprocessor,” and the term “603” is used as an abbreviation for the phrase,
“PowerPC 603 microprocessor.” The PowerPC 603e microprocessors are available from
IBM as PPC603e and from Motorola as MPC603e.

Additional Reading 

Following is a list of additional reading that provides background for the information in 
this manual: 

• PowerPC Microprocessor Family: The Programming Environments, MPCFPE/AD 
(Motorola Order Number) and MPRPPCFPE-01 (IBM Order Number) 

• The PowerPC Architecture: A Specification for a New Family of RISC Processors, 
Second Edition, Morgan Kaufmann Publishers, Inc., San Francisco, CA 

• John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative 
Approach, Morgan Kaufmann Publishers, Inc., San Mateo, CA 

• PowerPC 601 RISC Microprocessor User's Manual, Rev 1
MPC601UM/AD (Motorola Order Number) and 52G7484/(MPR601UMU-02) 
(IBM Order Number) 

• PowerPC 601 RISC Microprocessor Technical Summary, Rev 1 
MPC601/D (Motorola order number) and MPR601TSU-02 (IBM order number) 

• PowerPC 603 RISC Microprocessor Technical Summary, Rev 3 
MPC603/D (Motorola order number) and MPR603TSU-03 (IBM order number) 

• PowerPC 603 RISC Microprocessor Hardware Specifications, Rev 2 
MPC603EC/D (Motorola order #) and MPR603HSU-03 (IBM order #) 

• PowerPC 603e RISC Microprocessor Technical Summary, Rev 0
MPC603E/D (Motorola order number) and MPR603TSU-04 (IBM order number) 

• PowerPC 603e RISC Microprocessor Hardware Specifications, Rev 0 
MPC603EEC/D (Motorola order #) and MPR603EHS-01 (IBM order #) 

• PowerPC 604 RISC Microprocessor User's Manual, MPC604UM/AD (Motorola
order number) and MPR604UMU-01 (IBM order number) 

• PowerPC 604 RISC Microprocessor Technical Summary, Rev 1 
MPC604/D (Motorola order number) and MPR604TSU-02 (IBM order number) 

• PowerPC 620 RISC Microprocessor Technical Summary, MPC620/D (Motorola 
order number) and MPR620TSU-01 (IBM order number)

Conventions

This document uses the following notational conventions: 



ACTIVE_HIGH     Names for signals that are active high are shown in uppercase text
                without an overbar.

ACTIVE_LOW      A bar over a signal name indicates that the signal is active low; for
                example, ARTRY (address retry) and TS (transfer start). Active-low
                signals are referred to as asserted (active) when they are low and
                negated when they are high. Signals that are not active low, such as
                AP0-AP3 (address bus parity signals) and TT0-TT4 (transfer type
                signals), are referred to as asserted when they are high and negated
                when they are low.

mnemonics       Instruction mnemonics are shown in lowercase bold.

italics         Italics indicate variable command parameters, for example, bcctrx.

0x0             Prefix to denote hexadecimal number.

0b0             Prefix to denote binary number.

rA, rB          Instruction syntax used to identify a source GPR.

rA|0            The contents of a specified GPR or the value 0.

rD              Instruction syntax used to identify a destination GPR.

frA, frB, frC   Instruction syntax used to identify a source FPR.

frD             Instruction syntax used to identify a destination FPR.

REG[FIELD]      Abbreviations or acronyms for registers are shown in uppercase text.
                Specific bits, fields, or ranges appear in brackets. For example,
                MSR[LE] refers to the little-endian mode enable bit in the machine
                state register.

x               In certain contexts, such as a signal encoding, this indicates a don’t
                care.

n               Used to express an undefined numerical value.



Acronyms and Abbreviations 

Table i contains acronyms and abbreviations that are used in this document. 



Table i. Acronyms and Abbreviated Terms

Term      Meaning
ALU       Arithmetic logic unit
ATE       Automatic test equipment
ASR       Address space register
BAT       Block address translation


BIST      Built-in self test
BIU       Bus interface unit
BPU       Branch processing unit
BUC       Bus unit controller
BUID      Bus unit ID
CAR       Cache address register
CIA       Current instruction address
CMOS      Complementary metal-oxide semiconductor
COP       Common on-chip processor
CR        Condition register
CRTRY     Cache retry queue
CTR       Count register
DAR       Data address register
DBAT      Data BAT
DCMP      Data TLB compare
DEC       Decrementer register
DMISS     Data TLB miss address
DSISR     Register used for determining the source of a DSI exception
DTLB      Data translation lookaside buffer
EA        Effective address
EAR       External access register
ECC       Error checking and correction
FIFO      First-in-first-out
FPECR     Floating-point exception cause register
FPR       Floating-point register
FPSCR     Floating-point status and control register
FPU       Floating-point unit
GPR       General-purpose register
HASH1     Primary hash address
HASH2     Secondary hash address
IABR      Instruction address breakpoint register
IBAT      Instruction BAT


ICMP      Instruction TLB compare
IEEE      Institute of Electrical and Electronics Engineers
IMISS     Instruction TLB miss address
IQ        Instruction queue
ITLB      Instruction translation lookaside buffer
IU        Integer unit
L2        Secondary cache
LIFO      Last-in-first-out
LR        Link register
LRU       Least recently used
LSB       Least-significant byte
lsb       Least-significant bit
LSU       Load/store unit
MEI       Modified/exclusive/invalid
MESI      Modified/exclusive/shared/invalid (cache coherency protocol)
MMU       Memory management unit
MQ        MQ register
MSB       Most-significant byte
msb       Most-significant bit
MSR       Machine state register
NaN       Not a number
No-op     No operation
OEA       Operating environment architecture
PID       Processor identification tag
PIR       Processor identification register
PLL       Phase-locked loop
POWER     Performance Optimized with Enhanced RISC architecture
PR        Privilege-level bit
PTE       Page table entry
PTEG      Page table entry group
PVR       Processor version register
RAW       Read-after-write


RISC      Reduced instruction set computing
RPA       Required physical address
RTC       Real-time clock
RTCL      Real-time clock lower register
RTCU      Real-time clock upper register
RTL       Register transfer language
RWITM     Read with intent to modify
SDR1      Register that specifies the page table base address for virtual-to-physical address translation
SLB       Segment lookaside buffer
SPR       Special-purpose register
SR        Segment register
SRR0      Machine status save/restore register 0
SRR1      Machine status save/restore register 1
SRU       System register unit
TAP       Test access port
TB        Time base facility
TBL       Time base lower register
TBU       Time base upper register
TLB       Translation lookaside buffer
TTL       Transistor-to-transistor logic
UIMM      Unsigned immediate value
UISA      User instruction set architecture
UTLB      Unified translation lookaside buffer
UUT       Unit under test
VEA       Virtual environment architecture
WAR       Write-after-read
WAW       Write-after-write
WIMG      Write-through/caching-inhibited/memory-coherency enforced/guarded bits
XATC      Extended address transfer code
XER       Register used for indicating conditions such as carries and overflows for integer operations





Terminology Conventions 

Table ii describes terminology conventions used in this manual. 

Table ii. Terminology Conventions

The Architecture Specification           This Manual
Data storage interrupt (DSI)             DSI exception
Extended mnemonics                       Simplified mnemonics
Fixed-point unit (FXU)                   Integer unit (IU)
Instruction storage interrupt (ISI)      ISI exception
Interrupt                                Exception
Privileged mode (or privileged state)    Supervisor-level privilege
Problem mode (or problem state)          User-level privilege
Real address                             Physical address
Relocation                               Translation
Storage (locations)                      Memory
Storage (the act of)                     Access
Store in                                 Write back
Store through                            Write through



Table iii describes instruction field notation used in this manual. 

Table iii. Instruction Field Conventions

The Architecture Specification    Equivalent to:
BA, BB, BT                        crbA, crbB, crbD (respectively)
BF, BFA                           crfD, crfS (respectively)
D                                 d
DS                                ds
FLM                               FM
FRA, FRB, FRC, FRT, FRS           frA, frB, frC, frD, frS (respectively)
FXM                               CRM
RA, RB, RT, RS                    rA, rB, rD, rS (respectively)
SI                                SIMM
U                                 IMM
UI                                UIMM
/, //, ///                        0...0 (shaded)





























































Chapter 1 
Overview 



This chapter provides an overview of PowerPC 603e™ microprocessor features and the 
PowerPC Architecture™, and information about how the 603e implementation complies 
with the architectural definitions. 

1.1 PowerPC 603e Microprocessor Overview

This section describes the features of the 603e, provides a block diagram showing the major 
functional units, and gives an overview of how the 603e operates. 

The 603e is a low-power implementation of the PowerPC™ family of reduced instruction 
set computer (RISC) microprocessors. The 603e implements the 32-bit portion of the 
PowerPC architecture, which provides 32-bit effective addresses, integer data types of 8, 
16, and 32 bits, and floating-point data types of 32 and 64 bits. For 64-bit PowerPC 
microprocessors, the PowerPC architecture provides 64-bit integer data types, 64-bit 
addressing, and other features required to complete the 64-bit architecture. 

The 603e provides four software controllable power-saving modes. Three of the modes (the 
nap, doze, and sleep modes) are static in nature, and progressively reduce the amount of 
power dissipated by the processor. The fourth is a dynamic power management mode that 
causes the functional units in the 603e to automatically enter a low-power mode when the 
functional units are idle without affecting operational performance, software execution, or 
any external hardware. 

The 603e is a superscalar processor capable of issuing and retiring as many as three 
instructions per clock. Instructions can execute out of order for increased performance; 
however, the 603e makes completion appear sequential. 

The 603e integrates five execution units — an integer unit (IU), a floating-point unit (FPU),
a branch processing unit (BPU), a load/store unit (LSU), and a system register unit (SRU). 
The ability to execute five instructions in parallel and the use of simple instructions with 
rapid execution times yield high efficiency and throughput for 603e-based systems. Most 
integer instructions execute in one clock cycle. The FPU is pipelined so a single-precision 
multiply-add instruction can be issued every clock cycle.

The 603e provides independent on-chip, 16-Kbyte, four-way set-associative, physically
addressed caches for instructions and data and on-chip instruction and data memory
management units (MMUs). The MMUs contain 64-entry, two-way set-associative, data
and instruction translation lookaside buffers (DTLB and ITLB) that provide support for
demand-paged virtual memory address translation and variable-sized block translation.
The TLBs and caches use a least recently used (LRU) replacement algorithm. The 603e also
supports block address translation through the use of two independent instruction and data
block address translation (IBAT and DBAT) arrays of four entries each. Effective addresses
are compared simultaneously with all four entries in the BAT array during block
translation. In accordance with the PowerPC architecture, if an effective address hits in
both the TLB and BAT array, the BAT translation takes priority.
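
The priority rule described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the tuple layout of a BAT entry (effective base, block size, physical base), the tlb_lookup callback, and the function name translate are hypothetical and are not taken from the 603e. The hardware compares an effective address against all four BAT entries simultaneously, whereas this model simply walks them in a loop.

```python
def translate(ea, bat_array, tlb_lookup):
    """Model the BAT-over-TLB priority rule for one effective address.

    bat_array:  list of hypothetical (ea_base, block_size, pa_base) tuples,
                at most four entries as in the 603e BAT arrays.
    tlb_lookup: hypothetical callable returning a physical address or None.
    Returns a (source, physical_address) pair.
    """
    if len(bat_array) > 4:
        raise ValueError("a BAT array holds four entries")

    # A BAT hit wins even if the TLB would also hit.
    for ea_base, block_size, pa_base in bat_array:
        if ea_base <= ea < ea_base + block_size:
            return "BAT", pa_base + (ea - ea_base)

    # Fall back to the page translation held in the TLB.
    pa = tlb_lookup(ea)
    if pa is not None:
        return "TLB", pa

    return "MISS", None
```

In this model an address covered by both mechanisms is always reported as a BAT translation, which is the behavior the paragraph above attributes to the PowerPC architecture.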

The 603e has a selectable 32- or 64-bit data bus and a 32-bit address bus. The 603e interface 
protocol allows multiple masters to compete for system resources through a central external 
arbiter. The 603e provides a three-state coherency protocol that supports the exclusive, 
modified, and invalid cache states. This protocol is a compatible subset of the MESI 
(modified/exclusive/shared/invalid) four-state protocol and operates coherently in systems 
that contain four-state caches. The 603e supports single-beat and burst data transfers for
memory accesses, and supports memory-mapped I/O. 

The 603e uses an advanced, 3.3-V CMOS process technology and maintains full interface 
compatibility with TTL devices. 

1.1.1 PowerPC 603e Microprocessor Features

This section describes details of the 603e’s implementation of the PowerPC architecture. 
Major features of the 603e are as follows: 

• High-performance, superscalar microprocessor 

— As many as three instructions issued and retired per clock 
— As many as five instructions in execution per clock 
— Single-cycle execution for most instructions 

— Pipelined FPU for all single-precision and most double-precision operations 

• Five independent execution units and two register files 
— BPU featuring static branch prediction 

— A 32-bit IU

— Fully IEEE 754-compliant FPU for both single- and double-precision operations 
— LSU for data transfer between data cache and GPRs and FPRs 

— SRU that executes condition register (CR), special-purpose register (SPR) 
instructions, and integer add/compare instructions 

— Thirty-two GPRs for integer operands 
— Thirty-two FPRs for single- or double-precision operands

• High instruction and data throughput

— Zero-cycle branch capability (branch folding) 

— Programmable static branch prediction on unresolved conditional branches 

— Instruction fetch unit capable of fetching two instructions per clock from the 
instruction cache 

— A six-entry instruction queue that provides lookahead capability 

— Independent pipelines with feed-forwarding that reduces data dependencies in 
hardware 

— 16-Kbyte data cache — four-way set-associative, physically addressed; LRU
replacement algorithm

— 16-Kbyte instruction cache — four-way set-associative, physically addressed;
LRU replacement algorithm

— Cache write-back or write-through operation programmable on a per page or per 
block basis 

— BPU that performs CR lookahead operations 

— Address translation facilities for 4-Kbyte page size, variable block size, and 
256-Mbyte segment size 

— A 64-entry, two-way set-associative ITLB 

— A 64-entry, two-way set-associative DTLB 

— Four-entry data and instruction BAT arrays providing 128-Kbyte to 256-Mbyte 
blocks 

— Software table search operations and updates supported through fast trap 
mechanism 

— 52-bit virtual address; 32-bit physical address 

• Facilities for enhanced system performance 

— A 32- or 64-bit split-transaction external data bus with burst transfers 

— Support for one-level address pipelining and out-of-order bus transactions 

• Integrated power management 

— Low-power 3.3-volt design 

— Internal processor/bus clock multiplier that provides 1:1, 1.5:1, 2:1, 2.5:1, 3:1, 
3.5:1, and 4:1 ratios 

— Three power saving modes: doze, nap, and sleep 

— Automatic dynamic power reduction when internal functional units are idle 

• In-system testability and debugging features through JTAG boundary-scan
capability
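
The address translation figures in the list above (4-Kbyte pages, 256-Mbyte segments, and a 52-bit virtual address) fit together arithmetically, as the following sketch shows. It assumes a 24-bit virtual segment ID, which is not stated in this section, and the function and constant names are illustrative only.

```python
PAGE_SHIFT = 12      # 4-Kbyte page: 2**12 bytes
SEGMENT_SHIFT = 28   # 256-Mbyte segment: 2**28 bytes
VSID_BITS = 24       # assumption: VSID width is not given in this section


def split_effective_address(ea):
    """Split a 32-bit effective address into segment, page index, byte offset."""
    ea &= 0xFFFFFFFF
    segment = ea >> SEGMENT_SHIFT                                  # upper 4 bits
    page_index = (ea >> PAGE_SHIFT) & ((1 << (SEGMENT_SHIFT - PAGE_SHIFT)) - 1)
    byte_offset = ea & ((1 << PAGE_SHIFT) - 1)                     # lower 12 bits
    return segment, page_index, byte_offset


def virtual_address_bits():
    """VSID bits plus the 28 bits addressed within a segment."""
    return VSID_BITS + SEGMENT_SHIFT
```

Under these assumptions a 32-bit effective address selects one of 16 segments, a 16-bit page index within the segment, and a 12-bit byte offset, and the virtual address width works out to the 52 bits quoted in the feature list.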
1.1.2 PowerPC 603™ and PowerPC 603e Microprocessors — System
Design and Programming Considerations

The 603e is built upon the low power dissipation, low cost and high performance attributes 
of the 603 while providing the system designer additional capabilities through higher 
processor clock speeds (to 100 MHz), increases in cache size (16-Kbyte instruction and 
data caches) and set associativity (four-way), and greater system clock flexibility. The 
following subsections describe the differences between the 603 and the 603e that affect the 
system designer and programmer already familiar with the operation of the 603. 

The design enhancements to the 603e are described in the following sections as changes 
that can require a modification to the hardware or software configuration of a system 
designed for the 603. 

1.1.2.1 Hardware Features

The following hardware features of the 603e may require system designers to modify 
systems designed for the 603. 

1.1.2.1.1 Replacement of XATS Signal by CSE1 Signal

The 603e employs four-way set associativity for both the instruction and data caches, in 
place of the two-way set associativity used in the 603. This change requires the use of an 
additional cache set entry (CSE1) signal to indicate which member of the cache set is being
loaded during a cache line fill. The CSE1 signal on the 603e is in the same pin location as
the XATS signal on the 603. Note that the XATS signal is no longer needed by the 603e 
because support for access to direct-store segments has been removed. 

Table 1-1 shows the CSE0-CSE1 signal encoding indicating the cache set element selected
during a cache load operation.

Table 1-1. CSE0-CSE1 Signals

CSE0-CSE1    Cache Set Element
00           Set 0
01           Set 1
10           Set 2
11           Set 3
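
The encoding in Table 1-1 is a plain two-bit binary number with CSE0 as the high-order bit. A minimal sketch of the decoding follows; the function name and the use of integer 0/1 values for signal levels are assumptions made for illustration.

```python
def cache_set_element(cse0, cse1):
    """Return the cache set element (0-3) selected by the CSE0-CSE1 signals.

    cse0 and cse1 are the logical values (0 or 1) of the two signals,
    with CSE0 as the high-order bit, following Table 1-1.
    """
    if cse0 not in (0, 1) or cse1 not in (0, 1):
        raise ValueError("CSE0 and CSE1 are single-bit values")
    return (cse0 << 1) | cse1
```

A system monitoring cache line fills could use a mapping like this to tell which of the four set elements is being loaded.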



1.1.2.1.2 Addition of Half-Clock Bus Multipliers

Some of the reserved clock configuration signal settings of the 603 are redefined to allow 
more flexible selection of higher internal and bus clock frequencies. See Table 1-2 for the 
PLL_CFG0-PLL_CFG3 signal settings provided by the 603e.

Table 1-2. PowerPC 603e Microprocessor PLL Configuration

                         CPU Frequency in MHz
PLL_CFG0    CPU/SYSCLK   Bus         Bus         Bus         Bus         Bus         Bus         Bus         Bus
to          Ratio        16.6 MHz    20 MHz      25 MHz      33.3 MHz    40 MHz      50 MHz      60 MHz      66.6 MHz
PLL_CFG3
0000        1:1          -           -           -           -           -           -           60 (120)    66.6 (133)
0001        1:1          -           -           -           33.3 (133)  40 (160)    50 (200)    -           -
0010        1:1          16.6 (133)  20 (160)    25 (200)    -           -           -           -           -
0100        2:1          -           -           -           66.6 (133)  80 (160)    100 (200)   -           -
0101        2:1          33.3 (133)  40 (160)    50 (200)    -           -           -           -           -
0110        2.5:1        -           -           -           83.3 (166)  100 (200)   -           -           -
1000        3:1          -           -           75 (150)    100 (200)   -           -           -           -
1010        4:1          66.6 (133)  80 (160)    100 (200)   -           -           -           -           -
1100        1.5:1        -           -           -           -           -           75 (150)    90 (180)    100 (200)
1110        3.5:1        -           70 (140)    87.5 (175)  -           -           -           -           -
0011        PLL bypass
1111        Clock off



Notes:

1. Some PLL configurations may select bus, CPU, or VCO frequencies that are not supported or not tested
for by the 603e. PLL frequencies (shown in parentheses in Table 1-2) should not fall below 133 MHz and
should not exceed 200 MHz.

2. In PLL bypass mode, the SYSCLK input signal clocks the internal processor directly, the PLL is disabled, and 
the bus is set for 1:1 mode operation. 

3. In clock-off mode, no clocking occurs inside the 603e regardless of the SYSCLK input. 
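
The relationships in Table 1-2 and its notes can be checked with a short calculation. The sketch below is illustrative only: the dictionary name, the function names, and the tolerance value are hypothetical, and the VCO-to-CPU multipliers are inferred from the parenthesized values in the table rather than stated by the manual.

```python
# PLL_CFG0-PLL_CFG3 setting -> (CPU/SYSCLK ratio, VCO-to-CPU multiplier).
# Ratios are transcribed from Table 1-2; the VCO multipliers are inferred
# from the parenthesized table values (an assumption of this sketch).
PLL_CONFIG = {
    0b0000: (1.0, 2),
    0b0001: (1.0, 4),
    0b0010: (1.0, 8),
    0b0100: (2.0, 2),
    0b0101: (2.0, 4),
    0b0110: (2.5, 2),
    0b1000: (3.0, 2),
    0b1010: (4.0, 2),
    0b1100: (1.5, 2),
    0b1110: (3.5, 2),
}

VCO_MIN_MHZ = 133.0    # lower PLL frequency limit from Note 1
VCO_MAX_MHZ = 200.0    # upper PLL frequency limit from Note 1
TOLERANCE_MHZ = 0.5    # hypothetical slack for the table's rounded values


def pll_frequencies(pll_cfg, bus_mhz):
    """Return (cpu_mhz, vco_mhz) for a PLL_CFG setting and bus frequency."""
    if pll_cfg == 0b0011:
        raise ValueError("PLL bypass: SYSCLK clocks the processor directly")
    if pll_cfg == 0b1111:
        raise ValueError("clock off: no clocking occurs inside the 603e")
    if pll_cfg not in PLL_CONFIG:
        raise ValueError("setting is not listed in Table 1-2")
    ratio, vco_multiplier = PLL_CONFIG[pll_cfg]
    cpu_mhz = bus_mhz * ratio
    return cpu_mhz, cpu_mhz * vco_multiplier


def vco_in_range(pll_cfg, bus_mhz):
    """Check the resulting PLL frequency against the 133-200 MHz limits."""
    _, vco_mhz = pll_frequencies(pll_cfg, bus_mhz)
    return VCO_MIN_MHZ - TOLERANCE_MHZ <= vco_mhz <= VCO_MAX_MHZ + TOLERANCE_MHZ
```

For example, setting 0100 (2:1) with a 40-MHz bus gives an 80-MHz CPU clock and a 160-MHz PLL frequency, which matches the table entry; the same setting with a 66.6-MHz bus would push the PLL frequency well above the 200-MHz limit.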
1.1.2.2 Software Features

The features of the 603e described in the following sections affect software originally 
written for the 603. 

1.1.2.2.1 16-Kbyte Instruction and Data Caches

The instruction and data caches of the 603e are 16 Kbytes in size, compared to the 8-Kbyte 
instruction and data caches of the 603. The increase in cache size may require modification 
of cache flush routines. The increase in cache size is also reflected in four-way set 
associativity of the instruction and data caches in place of the two-way set associativity in 
the 603. 
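
The effect on a cache flush routine can be estimated with simple arithmetic. The sketch below assumes a 32-byte cache line, which is not stated in this section, and the function names are illustrative only.

```python
LINE_BYTES = 32  # assumption: the cache line size is not given in this section


def cache_sets(size_bytes, ways, line_bytes=LINE_BYTES):
    """Number of sets in a set-associative cache of the given geometry."""
    return size_bytes // (ways * line_bytes)


def lines_to_flush(size_bytes, line_bytes=LINE_BYTES):
    """Number of cache lines a flush routine must cover for one cache."""
    return size_bytes // line_bytes


# 603: 8-Kbyte, two-way caches; 603e: 16-Kbyte, four-way caches.
sets_603 = cache_sets(8 * 1024, 2)
sets_603e = cache_sets(16 * 1024, 4)
```

Under the 32-byte line assumption both processors have the same number of sets, and the 603e simply doubles the number of ways. A flush loop sized for the 603 would therefore cover only half of the lines in a 603e cache.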

1.1.2.2.2 Clock Configuration Available in HID1 Register

Bits 0-3 in the new HID1 register (SPR 1009) provide software read-only access to the
configuration of the PLL_CFG signals. The HID1 register is not implemented in the 603.
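
Once software has read the register value, isolating bits 0-3 is a shift and mask. The sketch below assumes the PowerPC bit-numbering convention in which bit 0 is the most-significant bit of the 32-bit register; that convention is not restated in this section, and the function name is hypothetical. Reading SPR 1009 itself requires a processor instruction and is outside the scope of this example.

```python
def pll_cfg_from_hid1(hid1):
    """Extract bits 0-3 of a 32-bit register value.

    Assumes bit 0 is the most-significant bit, so bits 0-3 are the
    upper four bits of the value.
    """
    return (hid1 >> 28) & 0xF
```

The four-bit result can be compared directly with the PLL_CFG0-PLL_CFG3 settings listed in Table 1-2.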

1.1.2.2.3 Performance Enhancements

The following enhancements provide improved performance without any required changes 
to software (other than compiler optimization) or hardware designed for the 603: 

• Support for single-cycle store. 

• Addition of adder/comparator in system register unit allows dispatch and execution 
of multiple integer add and compare instructions on each cycle. 

• Addition of a key bit (bit 12) to SRR1 to provide information about memory
protection violations prior to page table search operations. This key bit is set when
the combination of the settings in the appropriate Kx bit in the segment register and
the MSR[PR] bit indicates that when the PP bits in the PTE are set to either 00 or
01, a protection violation exists; if this is the case for a data write operation with a
DTLB miss, the changed (C) bit in the page tables should not be updated (see
Table 1-3). This reduces the time required to execute the page table search routine
as the software no longer has to explicitly read both the Kx and MSR[PR] bits to
determine whether a protection violation exists before updating the C bit.

Table 1-3. SRR1[Key] Bit Generated by PowerPC 603e Microprocessor

Segment Register    MSR[PR]    SRR1[Key] Generated
[Ks, Kp]                       on DTLB Misses
0x                  0          0
x0                  1          0
1x                  0          1
x1                  1          1

Note that this key bit indicates a protection violation if the
PTE[PP] bits are either 00 or 01.
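
Table 1-3 reduces to a simple selection: the key follows Ks when MSR[PR] is 0 and Kp when MSR[PR] is 1. The sketch below models that selection and the note beneath the table; it is a software illustration with hypothetical function names, not a description of the hardware logic or of an actual table search routine.

```python
def srr1_key(ks, kp, msr_pr):
    """Key bit per Table 1-3: Ks applies when MSR[PR] = 0, Kp when MSR[PR] = 1."""
    return kp if msr_pr else ks


def protection_violation(key, pp):
    """Per the note under Table 1-3: key = 1 with PTE[PP] of 00 or 01."""
    return key == 1 and pp in (0b00, 0b01)


def may_update_changed_bit(key, pp):
    """For a data write with a DTLB miss, skip the C-bit update on a violation."""
    return not protection_violation(key, pp)
```

With the key available in one place, a search routine modeled this way needs only the key and the PP bits to decide whether the C bit may be updated.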
1.1.3 Block Diagram

Figure 1-1 provides a block diagram of the 603e that illustrates how the execution
units — IU, FPU, BPU, LSU, and SRU — operate independently and in parallel.

The 603e provides address translation and protection facilities, including an ITLB, DTLB,
and instruction and data BAT arrays. Instruction fetching and issuing are handled in the
instruction unit. Translation of addresses for cache or external memory accesses is
handled by the MMUs. Both units are discussed in more detail in Sections 1.1.4,
“Instruction Unit,” and 1.1.6.1, “Memory Management Units (MMUs).”

1.1.4 Instruction Unit 

As shown in Figure 1-1, the 603e instruction unit, which contains a fetch unit, instruction 
queue, dispatch unit, and BPU, provides centralized control of instruction flow to the 
execution units. The instruction unit determines the address of the next instruction to be 
fetched based on information from the sequential fetcher and from the BPU. 

The instruction unit fetches the instructions from the instruction cache into the instruction 
queue. The BPU extracts branch instructions from the fetcher and uses static branch 
prediction on unresolved conditional branches to allow the instruction unit to fetch 
instructions from a predicted target instruction stream while a conditional branch is 
evaluated. The BPU folds out branch instructions for unconditional branches or conditional 
branches unaffected by instructions in progress in the execution pipeline. 

Instructions issued beyond a predicted branch do not complete execution until the branch 
is resolved, preserving the programming model of sequential execution. If any of these 
instructions are to be executed in the BPU, they are decoded but not issued. Instructions to 
be executed by the FPU, IU, LSU, and SRU are issued and allowed to complete up to the
register write-back stage. Write-back is allowed when a correctly predicted branch is 
resolved, and instruction execution continues without interruption along the predicted path. 

If branch prediction is incorrect, the instruction unit flushes all predicted path instructions, 
and instructions are issued from the correct path. 

1.1.4.1 Instruction Queue and Dispatch Unit 

The instruction queue (IQ), shown in Figure 1-1, holds as many as six instructions and 
loads up to two instructions from the instruction unit during a single cycle. The instruction 
fetch unit continuously loads as many instructions as space in the IQ allows. Instructions 
are dispatched to their respective execution units from the dispatch unit at a maximum rate 
of two instructions per cycle. Dispatching is facilitated to the IU, FPU, LSU, and SRU by 
the provision of a reservation station at each unit. The dispatch unit performs source and 
destination register dependency checking, determines dispatch serializations, and inhibits 
subsequent instruction dispatching as required. 

For a more detailed overview of instruction dispatch, see Section 1.3.7, “Instruction 
Timing.” 

Figure 1-1. PowerPC 603e Microprocessor Block Diagram 

1.1.4.2 Branch Processing Unit (BPU) 

The BPU receives branch instructions from the fetch unit and performs CR lookahead 
operations on conditional branches to resolve them early, achieving the effect of a 
zero-cycle branch in many cases. 

The BPU uses a bit in the instruction encoding to predict the direction of the conditional 
branch. Therefore, when an unresolved conditional branch instruction is encountered, the 
603e fetches instructions from the predicted target stream until the conditional branch is 
resolved. 
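
As a rough illustration of static prediction, the sketch below applies the default rule that the PowerPC architecture defines for the Branch Conditional (bc) instruction: a backward branch is predicted taken, a forward branch is predicted not taken, and the low-order bit of the BO field (often called the y bit) reverses that default. The field positions follow the architecture's bc encoding. The function names are illustrative only, and the sketch deliberately ignores the bclr and bcctr forms and the branch-always BO encodings.

```python
def encode_bc(bo: int, bi: int, disp: int) -> int:
    """Build a 32-bit bc instruction word (AA=0, LK=0); disp is a byte offset."""
    assert disp % 4 == 0 and -(1 << 15) <= disp < (1 << 15)
    bd = (disp >> 2) & 0x3FFF          # 14-bit signed word displacement
    return (16 << 26) | ((bo & 0x1F) << 21) | ((bi & 0x1F) << 16) | (bd << 2)


def predict_taken(insn: int) -> bool:
    """Static prediction for bc: backward means taken, the y bit reverses it."""
    assert insn >> 26 == 16, "this sketch handles only the bc primary opcode"
    y_bit = (insn >> 21) & 1           # low-order bit of the BO field
    backward = bool(insn & 0x8000)     # sign bit of the BD field
    return backward != bool(y_bit)
```

A loop-closing branch such as `encode_bc(0b10000, 0, -8)` is predicted taken under this rule, which is why compilers usually leave the y bit clear for loops and set it only when profile data suggests otherwise.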

The BPU contains an adder to compute branch target addresses and three user-control 
registers — the link register (LR), the count register (CTR), and the CR. The BPU calculates 
the return pointer for subroutine calls and saves it into the LR for certain types of branch 
instructions. The LR also contains the branch target address for the Branch Conditional to 
Link Register (bclrx) instruction. The CTR contains the branch target address for the 
Branch Conditional to Count Register (bcctrx) instruction. The contents of the LR and 
CTR can be copied to or from any GPR. Because the BPU uses dedicated registers rather 
than GPRs or FPRs, execution of branch instructions is largely independent from execution 
of integer and floating-point instructions. 

1.1.5 Independent Execution Units 

The PowerPC architecture’s support for independent execution units allows 
implementation of processors with out-of-order instruction execution. For example, 
because branch instructions do not depend on GPRs or FPRs, branches can often be 
resolved early, eliminating stalls caused by taken branches. 

In addition to the BPU, the 603e provides four other execution units and a completion unit, 
which are described in the following sections. 

1.1.5.1 Integer Unit (IU) 

The IU executes all integer instructions. The IU executes one integer instruction at a time, 
performing computations with its arithmetic logic unit (ALU), multiplier, divider, and 
integer exception register (XER). Most integer instructions are single-cycle instructions. 
Thirty-two general-purpose registers are provided to support integer operations. Stalls due 
to contention for GPRs are minimized by the automatic allocation of rename registers. The 
603e writes the contents of the rename registers to the appropriate GPR when integer 
instructions are retired by the completion unit. 

1.1.5.2 Floating-Point Unit (FPU) 

The FPU contains a single-precision multiply-add array and the floating-point status and 
control register (FPSCR). The multiply-add array allows the 603e to efficiently implement 
multiply and multiply-add operations. The FPU is pipelined so that single-precision 
instructions and double-precision instructions can be issued back-to-back. Thirty-two 
floating-point registers are provided to support floating-point operations. Stalls due to 
contention for FPRs are minimized by the automatic allocation of rename registers. The 
603e writes the contents of the rename registers to the appropriate FPR when floating-point 
instructions are retired by the completion unit. 

The 603e supports all IEEE 754 floating-point data types (normalized, denormalized, NaN, 
zero, and infinity) in hardware, eliminating the latency incurred by software exception 
routines. (The term ‘exception’ is also referred to as ‘interrupt’ in the architecture 
specification.) 

1.1.5.3 Load/Store Unit (LSU) 

The LSU executes all load and store instructions and provides the data transfer interface 
between the GPRs, FPRs, and the cache/memory subsystem. The LSU calculates effective 
addresses, performs data alignment, and provides sequencing for load/store string and 
multiple instructions. 

Load and store instructions are issued and translated in program order; however, the actual 
memory accesses can occur out of order. Synchronizing instructions are provided to 
enforce strict ordering. 

Cacheable loads, when free of data dependencies, execute in an out-of-order manner with 
a maximum throughput of one per cycle and a two-cycle total latency. Data returned from 
the cache is held in a rename register until the completion logic commits the value to a GPR 
or FPR. Stores cannot be executed in a predicted manner and are held in the store queue 
until the completion logic signals that the store operation is to be completed to memory. The 
603e executes store instructions with a maximum throughput of one per cycle and a 
three-cycle total latency. The time required to perform the actual load or store operation 
varies depending on whether the operation involves the cache, system memory, or an I/O 
device. 

1.1.5.4 System Register Unit (SRU) 

The SRU executes various system-level instructions, including condition register logical 
operations and move to/from special-purpose register instructions, and also executes 
integer add/compare instructions. In order to maintain system state, most instructions 
executed by the SRU are completion-serialized; that is, the instruction is held for execution 
in the SRU until all prior instructions issued have completed. Results from 
completion-serialized instructions executed by the SRU are not available or forwarded for 
subsequent instructions until the instruction completes. 

1.1.5.5 Completion Unit 

The completion unit tracks instructions from dispatch through execution, and then retires, 
or “completes” them in program order. Completing an instruction commits the 603e to any 
architectural register changes caused by that instruction. In-order completion ensures the 
correct architectural state when the 603e must recover from a mispredicted branch or any 
exception. 

Instruction state and other information required for completion is kept in a first-in-first-out 
(FIFO) queue of five completion buffers. A single completion buffer is allocated for each 
instruction once it enters the dispatch unit. An available completion buffer is a required 
resource for instruction dispatch; if no completion buffers are available, instruction 
dispatch stalls. A maximum of two instructions per cycle are completed in order from the 
queue. 
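
The interaction between dispatch and in-order completion can be sketched with a small cycle-by-cycle model. The buffer count and the two-per-cycle limits come from the text above; the per-instruction latencies, the function name, and the ordering of retire and dispatch within a cycle are assumptions made only for illustration, not a description of the actual pipeline timing.

```python
from collections import deque

COMPLETION_BUFFERS = 5   # FIFO depth stated in the text
MAX_PER_CYCLE = 2        # dispatch and completion limits stated in the text


def simulate(latencies):
    """Return the cycle in which each instruction completes.

    latencies[i] is a hypothetical number of cycles that instruction i
    needs after the cycle in which it is dispatched.
    """
    fifo = deque()                      # entries are (index, cycle result is ready)
    done = [None] * len(latencies)
    next_insn, cycle = 0, 0
    while next_insn < len(latencies) or fifo:
        cycle += 1
        # Retire in program order: only the oldest entries may complete.
        retired = 0
        while fifo and retired < MAX_PER_CYCLE and fifo[0][1] <= cycle:
            index, _ = fifo.popleft()
            done[index] = cycle
            retired += 1
        # Dispatch stalls when no completion buffer is free.
        dispatched = 0
        while (next_insn < len(latencies) and dispatched < MAX_PER_CYCLE
               and len(fifo) < COMPLETION_BUFFERS):
            fifo.append((next_insn, cycle + latencies[next_insn]))
            next_insn += 1
            dispatched += 1
    return done
```

Calling `simulate([10, 1, 1, 1, 1, 1])` shows a long-latency instruction at the head of the queue holding back younger instructions that have already finished, and the sixth instruction waiting for a free buffer.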



1.1.6 Memory Subsystem Support 

The 603e provides support for cache and memory management through dual instruction 
and data memory management units. The 603e also provides dual 16-Kbyte instruction and 
data caches, and an efficient processor bus interface to facilitate access to main memory and 
other bus subsystems. The memory subsystem support functions are described in the 
following subsections. 

1.1.6.1 Memory Management Units (MMUs) 

The 603e’s MMUs support up to 4 Petabytes (2^52) of virtual memory and 4 Gigabytes (2^32) 
of physical memory (referred to as real memory in the architecture specification) for 
instruction and data. The MMUs also control access privileges for these spaces on block 
and page granularities. Referenced and changed status is maintained by the processor for 
each page to assist implementation of a demand-paged virtual memory system. A key bit is 
implemented to provide information about memory protection violations prior to page table 
search operations. 

The LSU calculates effective addresses for data loads and stores, performs data alignment 
to and from cache memory, and provides the sequencing for load and store string and 
multiple word instructions. The instruction unit calculates the effective addresses for 
instruction fetching. 

After an address is generated, the higher-order bits of the effective address are translated 
by the appropriate MMU into physical address bits. Simultaneously, the lower-order 
address bits (which are untranslated and therefore considered both logical and physical) are 
directed to the on-chip caches where they form the index into the four-way set-associative 
tag array. After translating the address, the MMU passes the higher-order bits of the 
physical address to the cache, and the cache lookup completes. For caching-inhibited 
accesses or accesses that miss in the cache, the untranslated lower-order address bits are 
concatenated with the translated higher-order address bits; the resulting 32-bit physical 
address is then used by the memory unit and the system interface, which accesses external 
memory. 
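
The reason the cache lookup can start in parallel with translation follows from the cache geometry. The sketch below works through the arithmetic using the cache size, associativity, and line size given in this chapter; the 4-Kbyte page size is the value defined by the PowerPC architecture, and the function name is purely illustrative.

```python
CACHE_BYTES = 16 * 1024   # from the text: 16-Kbyte cache
WAYS = 4                  # from the text: four-way set-associative
LINE_BYTES = 32           # from the text: 32-byte cache line
PAGE_BYTES = 4 * 1024     # assumption: 4-Kbyte page defined by the architecture

SETS = CACHE_BYTES // (WAYS * LINE_BYTES)        # 128 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1        # 5 bits select a byte in the line
INDEX_BITS = SETS.bit_length() - 1               # 7 bits select the set
PAGE_OFFSET_BITS = PAGE_BYTES.bit_length() - 1   # 12 untranslated bits


def split_address(ea: int):
    """Split a 32-bit effective address into (page number, set index, byte offset)."""
    offset = ea & (LINE_BYTES - 1)
    index = (ea >> OFFSET_BITS) & (SETS - 1)
    page_number = ea >> PAGE_OFFSET_BITS   # the only part the MMU translates
    return page_number, index, offset
```

Because the offset and index bits together occupy exactly the 12 untranslated low-order bits, the set can be selected before the MMU has produced the physical page number, which is then compared against the tags.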

The MMU also directs the address translation and enforces the protection hierarchy 
programmed by the operating system in relation to the supervisor/user privilege level of the 
access and in relation to whether the access is a load or store. 

For instruction accesses, the MMU performs an address lookup in both the 64 entries of the 
ITLB and in the IBAT array. If an effective address hits in both the ITLB and the IBAT 
array, the IBAT array translation takes priority. Data accesses cause a lookup in the DTLB 
and DBAT array for the physical address translation. In most cases, the physical address 
translation resides in one of the TLBs and the physical address bits are readily available to 
the on-chip cache. 

When the physical address translation misses in the TLBs, the 603e provides hardware 
assistance for software to perform a search of the translation tables in memory. The 
hardware assist consists of the following features: 

• Automatic storage of the missed effective address in the IMISS and DMISS registers 

• Automatic generation of the primary and secondary hashed real address of the page 
table entry group (PTEG), which are readable from the HASH1 and HASH2 register 
locations. 

The HASH data is generated from the contents of the IMISS or DMISS register. 
Which register is selected depends on which miss (instruction or data) was last 
acknowledged. 

• Automatic generation of the first word of the page table entry (PTE) for which the 
tables are being searched 

• A real page address (RPA) register that matches the format of the lower word of the 
PTE 

• Two TLB access instructions (tlbli and tlbld) that are used to load an address 
translation into the instruction or data TLBs 

• Shadow registers for GPRs 0-3 that allow miss code to execute without corrupting 
the state of any of the existing GPRs. 

These shadow registers are only used for servicing a TLB miss. 
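
To give a feel for what the HASH1 and HASH2 values represent, the sketch below computes primary and secondary PTEG addresses using the hashing scheme that the 32-bit PowerPC operating environment architecture defines for page table searches. It is a software model under stated assumptions, not a description of the 603e's internal logic: the SDR1 field layout (HTABORG in the upper 16 bits, HTABMASK in the low 9 bits) and the function name are assumptions taken from the general architecture, and the exact format of the HASH1 and HASH2 register contents is not modeled.

```python
def pteg_addresses(sdr1: int, vsid: int, page_index: int):
    """Return (primary, secondary) PTEG physical addresses for one page."""
    htaborg = (sdr1 >> 16) & 0xFFFF     # assumed: upper 16 bits of the table base
    htabmask = sdr1 & 0x1FF             # assumed: 9-bit mask that sizes the table
    hash1 = (vsid & 0x7FFFF) ^ (page_index & 0xFFFF)   # 19-bit primary hash
    hash2 = ~hash1 & 0x7FFFF                           # one's complement

    def address(h: int) -> int:
        upper9 = (h >> 10) & 0x1FF
        lower10 = h & 0x3FF
        # Each PTEG is 64 bytes, so the low six address bits are zero.
        return ((htaborg | (upper9 & htabmask)) << 16) | (lower10 << 6)

    return address(hash1), address(hash2)
```

A miss handler would search the eight PTEs of the primary group first and fall back to the secondary group only if no match is found.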

See Section 1.3.6.2, “PowerPC 603e Microprocessor Memory Management,” for more 
information about memory management for the 603e. 

1.1.6.2 Cache Units 

The 603e provides independent 16-Kbyte, four-way set-associative instruction and data 
caches. The cache line size is 32 bytes in length. The caches are designed to adhere to a 
write-back policy, but the 603e allows control of cacheability, write policy, and memory 
coherency at the page and block levels. The caches use a least recently used (LRU) 
replacement policy. 

As shown in Figure 1-1, the caches provide a 64-bit interface to the instruction fetch unit 
and load/store unit. The surrounding logic selects, organizes, and forwards the requested 
information to the requesting unit. Write operations to the cache can be performed on a byte 
basis, and a complete read-modify-write operation to the cache can occur in each cycle. 

The load/store and instruction fetch units provide the caches with the address of the data or 
instruction to be fetched. In the case of a cache hit, the cache returns two words to the 
requesting unit. 

Since the 603e data cache tags are single ported, simultaneous load or store and snoop 
accesses cause resource contention. Snoop accesses have the highest priority and are given 
first access to the tags, unless the snoop access coincides with a tag write, in which case the 
snoop is retried and must re-arbitrate for access to the cache. Loads or stores that are 
deferred due to snoop accesses are executed on the clock cycle following the snoop. 

1.1.7 Processor Bus Interface 

Because the caches on the 603e are on-chip, write-back caches, the predominant type of 
transaction for most applications is burst-read memory operations, followed by burst-write 
memory operations, and single-beat (noncacheable or write-through) memory read and 
write operations. Additionally, there can be address-only operations, variants of the burst 
and single-beat operations, (for example, global memory operations that are snooped and 
atomic memory operations), and address retry activity (for example, when a snooped read 
access hits a modified line in the cache). 

Memory accesses can occur in single-beat (1-8 bytes) and four-beat burst (32 bytes) data 
transfers when the bus is configured as 64 bits, and in single-beat (1-4 bytes), two-beat (8 
bytes), and eight-beat (32 bytes) data transfers when the bus is configured as 32 bits. The 
address and data buses operate independently to support pipelining and split transactions 
during memory accesses. The 603e can pipeline its own transactions to a depth of one level. 
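
The beat counts quoted above follow directly from dividing the 32-byte burst size by the width of the data bus. A minimal sketch of that arithmetic, with an illustrative function name:

```python
LINE_BYTES = 32  # from the text: a burst transfers 32 bytes


def burst_beats(bus_width_bits: int) -> int:
    """Number of data beats needed to transfer one 32-byte burst."""
    if bus_width_bits not in (32, 64):
        raise ValueError("the text describes only 32- and 64-bit bus modes")
    return LINE_BYTES // (bus_width_bits // 8)
```

With a 64-bit bus this gives four beats, and with a 32-bit bus it gives eight, matching the transfer sizes listed in the paragraph.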

Access to the system interface is granted through an external arbitration mechanism that 
allows devices to compete for bus mastership. This arbitration mechanism is flexible, 
allowing the 603e to be integrated into systems that implement various fairness and bus 
parking procedures to avoid arbitration overhead. 

Typically, memory accesses are weakly ordered — sequences of operations, including 
load/store string and multiple instructions, do not necessarily complete in the order they 
begin — maximizing the efficiency of the bus without sacrificing coherency of the data. The 
603e allows read operations to precede store operations (except when a dependency exists, 
or in cases where a non-cacheable access is performed), and provides support for a write 
operation to precede a previously queued read data tenure (for example, allowing a snoop 
push to be enveloped by the address and data tenures of a read operation). Because the 
processor can dynamically optimize run-time ordering of load/store traffic, overall 
performance is improved. 

1.1.8 System Support Functions 

The 603e implements several support functions that include power management, time 
base/decrementer registers for system timing tasks, an IEEE 1149.1(JTAG)/common 
on-chip processor (COP) test interface, and a phase-locked loop (PLL) clock multiplier. 
These system support functions are described in the following subsections. 

1.1.8.1 Power Management 

The 603e provides four power modes selectable by setting the appropriate control bits in 
the machine state register (MSR) and hardware implementation register 0 (HID0). 
The four power modes are as follows: 

• Full-power: This is the default power state of the 603e. The 603e is fully powered 
and the internal functional units are operating at the full processor clock speed. If the 
dynamic power management mode is enabled, functional units that are idle will 
automatically enter a low-power state without affecting performance, software 
execution, or external hardware. 

• Doze: All the functional units of the 603e are disabled except for the time 
base/decrementer registers and the bus snooping logic. When the processor is in 
doze mode, an external asynchronous interrupt, a system management interrupt, a 
decrementer exception, a hard or soft reset, or machine check brings the 603e into 
the full-power state. The 603e in doze mode maintains the PLL in a fully powered 
state and locked to the system external clock input (SYSCLK) so a transition to the 
full-power state takes only a few processor clock cycles. 

• Nap: The nap mode further reduces power consumption by disabling bus snooping, 
leaving only the time base register and the PLL in a powered state. The 603e returns 
to the full-power state upon receipt of an external asynchronous interrupt, a system 
management interrupt, a decrementer exception, a hard or soft reset, or a machine 
check input (MCP). A return to full-power state from a nap state takes only a few 
processor clock cycles. 

• Sleep: Sleep mode reduces power consumption to a minimum by disabling all 
internal functional units, after which external system logic may disable the PLL and 
SYSCLK. Returning the 603e to the full-power state requires the enabling of the 
PLL and SYSCLK, followed by the assertion of an external asynchronous interrupt, 
a system management interrupt, a hard or soft reset, or a machine check input (MCP) 
signal after the time required to relock the PLL. 
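
The wake-up conditions for the three low-power modes can be summarized as a lookup table. The sketch below simply restates the lists above as data; the event strings and the function name are illustrative labels, not signal names from this manual.

```python
_COMMON = {
    "external interrupt",
    "system management interrupt",
    "hard reset",
    "soft reset",
    "machine check",
}

# Events that return the processor to the full-power state, per the text.
WAKE_SOURCES = {
    "doze": _COMMON | {"decrementer exception"},
    "nap": _COMMON | {"decrementer exception"},
    "sleep": set(_COMMON),   # the decrementer is not listed for sleep mode
}


def wakes(mode: str, event: str) -> bool:
    """Return True if the event is listed as a wake-up source for the mode."""
    return event in WAKE_SOURCES[mode]
```

The table makes one practical difference easy to see: software that relies on the decrementer for periodic wake-up can use doze or nap mode, but needs an external event to leave sleep mode.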

1.1.8.2 Time Base/Decrementer 

The time base is a 64-bit register (accessed as two 32-bit registers) that is incremented once 
every four bus clock cycles; external control of the time base is provided through the time 
base enable (TBEN) signal. The decrementer is a 32-bit register that generates a 
decrementer interrupt exception after a programmable delay. The contents of the 
decrementer register are decremented once every four bus clock cycles, and the 
decrementer exception is generated as the count passes through zero. 
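
Because the decrementer counts once every four bus clock cycles, the value to load for a given delay depends only on the bus frequency. The following sketch shows the conversion; the function name and the 40-MHz bus frequency used in the usage note are example assumptions, not values from this manual.

```python
BUS_CLOCKS_PER_TICK = 4  # from the text: one count every four bus clock cycles


def decrementer_ticks(delay_us: int, bus_hz: int) -> int:
    """Decrementer value that yields roughly delay_us microseconds."""
    ticks = delay_us * bus_hz // (BUS_CLOCKS_PER_TICK * 1_000_000)
    if not 0 <= ticks < 2**32:
        raise ValueError("delay does not fit in the 32-bit decrementer")
    return ticks
```

For example, with a hypothetical 40-MHz bus clock the decrementer counts at 10 MHz, so `decrementer_ticks(1000, 40_000_000)` gives the count for a delay of about one millisecond.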

1.1.8.3 IEEE 1149.1 (JTAG)/COP Test Interface 

The 603e provides IEEE 1149.1 and COP functions for facilitating board testing and chip 
debug. The IEEE 1149.1 test interface provides a means for boundary-scan testing the 603e 
and the board to which it is attached. The COP function shares the IEEE 1149.1 test port, 
provides a means for executing test routines, and facilitates chip and software debugging. 

1.1.8.4 Clock Multiplier 

The internal clocking of the 603e is generated from and synchronized to the external clock 
signal, SYSCLK, by means of a voltage-controlled oscillator-based PLL. The PLL 
provides programmable internal processor clock rates of 1x, 1.5x, 2x, 2.5x, 3x, 3.5x, and 
4x multiples of the externally supplied clock frequency. The bus clock is the same 
frequency as, and is synchronous with, SYSCLK. The configuration of the PLL can be read by 
software from hardware implementation register 1 (HID1). 
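
The relationship between SYSCLK and the processor clock can be sketched as a simple multiplication over the set of multipliers listed above. The function name is illustrative, and the sketch does not check any frequency limits that may apply to particular multiplier settings.

```python
from fractions import Fraction

# Multipliers listed in the text: 1x, 1.5x, 2x, 2.5x, 3x, 3.5x, and 4x.
PLL_MULTIPLIERS = [Fraction(n, 2) for n in range(2, 9)]


def core_clock_hz(sysclk_hz: int, multiplier) -> Fraction:
    """Processor clock frequency for a given SYSCLK and PLL multiplier."""
    m = Fraction(multiplier)
    if m not in PLL_MULTIPLIERS:
        raise ValueError(f"unsupported multiplier: {multiplier}")
    return sysclk_hz * m
```

For instance, a 40-MHz SYSCLK with the 2.5x setting and a 25-MHz SYSCLK with the 4x setting both give a 100-MHz processor clock, while the bus clock stays at the SYSCLK frequency in each case.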

1.2 Levels of the PowerPC Architecture 

The PowerPC architecture consists of the following layers, and adherence to the PowerPC 
architecture can be measured in terms of which of the following levels of the architecture 
is implemented: 

• PowerPC user instruction set architecture (UISA) — Defines the base user-level 
instruction set, user-level registers, data types, floating-point exception model, 
memory models for a uniprocessor environment, and programming model for a 
uniprocessor environment. 

• PowerPC virtual environment architecture (VEA) — Describes the memory model 
for a multiprocessor environment, defines cache control instructions, and describes 
other aspects of virtual environments. Implementations that conform to the VEA 
also adhere to the UISA, but may not necessarily adhere to the OEA. 

• PowerPC operating environment architecture (OEA) — Defines the memory 
management model, supervisor-level registers, synchronization requirements, and 
the exception model. Implementations that conform to the OEA also adhere to the 
UISA and the VEA. 

The PowerPC architecture allows a wide range of designs for such features as cache and 
system interface implementations. 

1.3 PowerPC 603e Microprocessor: 
Implementation-Specific Information 

The PowerPC architecture is derived from the IBM POWER Architecture™ (Performance 
Optimized with Enhanced RISC architecture). The PowerPC architecture shares the 
benefits of the POWER architecture optimized for single-chip implementations. The 
PowerPC architecture design facilitates parallel instruction execution and is scalable to take 
advantage of future technological gains. 

This section describes the PowerPC architecture in general, and specific details about the 
implementation of the 603e as a low-power, 32-bit member of the PowerPC processor 
family. The main topics addressed are as follows: 

• Features — Section 1.3.1, “Features,” describes general features that the 603e shares 
with the PowerPC microprocessor family. 

• Registers and programming model — Section 1.3.2, “PowerPC Registers and 
Programming Model,” describes the registers for the operating environment 
architecture common among PowerPC processors and describes the programming 
model. It also describes the additional registers that are unique to the 603e. 

• Instruction set and addressing modes — Section 1.3.3, “Instruction Set and 
Addressing Modes,” describes the PowerPC instruction set and addressing modes 
for the PowerPC operating environment architecture, and defines and describes the 
PowerPC instructions implemented in the 603e. 

• Cache implementation — Section 1.3.4, “Cache Implementation,” describes the 
cache model that is defined generally for PowerPC processors by the virtual 
environment architecture. It also provides specific details about the 603e cache 
implementation. 

• Exception model — Section 1.3.5, “Exception Model,” describes the exception 
model of the PowerPC operating environment architecture and the differences in the 
603e exception model. 

• Memory management — Section 1.3.6, “Memory Management,” describes 
generally the conventions for memory management among the PowerPC 
processors. This section also describes the 603e’s implementation of the 32-bit 
PowerPC memory management specification. 

• Instruction timing — Section 1.3.7, “Instruction Timing,” provides a general 
description of the instruction timing provided by the superscalar, parallel execution 
supported by the PowerPC architecture and the 603e. 

• System interface — Section 1.3.8, “System Interface,” describes the signals 
implemented on the 603e. 

1.3.1 Features 

The 603e is a high-performance, superscalar PowerPC microprocessor. The PowerPC 
architecture allows optimizing compilers to schedule instructions to maximize 
performance through efficient use of the PowerPC instruction set and register model. The 
multiple, independent execution units allow compilers to optimize instruction throughput. 
Compilers that take advantage of the flexibility of the PowerPC architecture can 
additionally optimize system performance of the PowerPC processors. 

Specific features of the 603e are listed in Section 1.1.1, “PowerPC 603e Microprocessor 
Features.” 

1.3.2 PowerPC Registers and Programming Model 

The PowerPC architecture defines register-to-register operations for most computational 
instructions. Source operands for these instructions are accessed from the registers or are 
provided as immediate values embedded in the instruction opcode. The three-register 
instruction format allows specification of a target register distinct from the two source 
operands. Load and store instructions transfer data between registers and memory. 

PowerPC processors have two levels of privilege — supervisor mode of operation (typically 
used by the operating system) and user mode of operation (used by the application 
software). The programming models incorporate 32 GPRs, 32 FPRs, special-purpose 
registers (SPRs), and several miscellaneous registers. Each PowerPC microprocessor also 
has its own unique set of hardware implementation (HID) registers. 

Having access to privileged instructions, registers, and other resources allows the operating 
system to control the application environment (providing virtual memory and protecting 
operating-system and critical machine resources). Instructions that control the state of the 
processor, the address translation mechanism, and supervisor registers can be executed 
only when the processor is operating in supervisor mode. 

The following sections summarize the PowerPC registers that are implemented in the 603e. 

1.3.2.1 General-Purpose Registers (GPRs) 

The PowerPC architecture defines 32 user-level, general-purpose registers (GPRs). These 
registers are 32 bits wide in 32-bit PowerPC microprocessors and 64 bits wide in 
64-bit PowerPC microprocessors. The GPRs serve as the data source or destination for all 
integer instructions. 

1.3.2.2 Floating-Point Registers (FPRs) 

The PowerPC architecture also defines 32 user-level, 64-bit floating-point registers (FPRs). 
The FPRs serve as the data source or destination for floating-point instructions. These 
registers can contain data objects of either single- or double-precision floating-point 
formats. 

1.3.2.3 Condition Register (CR) 

The CR is a 32-bit user-level register that consists of eight four-bit fields that reflect the 
results of certain operations, such as move, integer and floating-point compare, arithmetic, 
and logical instructions, and provide a mechanism for testing and branching. 
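
The field layout of the CR can be illustrated by extracting one four-bit field from a 32-bit value. In the sketch below, CR0 is taken to be the most significant four bits, and the bit names (LT, GT, EQ, SO) are the ones the PowerPC architecture uses for integer comparison results; floating-point compares give the same four positions different meanings. The function name is illustrative.

```python
def cr_field(cr: int, n: int) -> dict:
    """Decode four-bit field CRn (n = 0..7) of a 32-bit CR value."""
    if not 0 <= n <= 7:
        raise ValueError("the CR has eight fields, CR0 through CR7")
    nibble = (cr >> (28 - 4 * n)) & 0xF   # CR0 is the most significant field
    return {
        "lt": bool(nibble & 0x8),   # less than
        "gt": bool(nibble & 0x4),   # greater than
        "eq": bool(nibble & 0x2),   # equal
        "so": bool(nibble & 0x1),   # summary overflow
    }
```

A conditional branch tests exactly one of these bits, selected by the BI field of the branch instruction.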

1.3.2.4 Floating-Point Status and Control Register (FPSCR) 

The floating-point status and control register (FPSCR) is a user-level register that contains 
all exception signal bits, exception summary bits, exception enable bits, and rounding 
control bits needed for compliance with the IEEE 754 standard. 

1.3.2.5 Machine State Register (MSR) 

The machine state register (MSR) is a supervisor-level register that defines the state of the 
processor. The contents of this register are saved when an exception is taken and restored 
when the exception handling completes. The 603e implements the MSR as a 32-bit register; 
64-bit PowerPC processors implement a 64-bit MSR. 

1.3.2.6 Segment Registers (SRs) 

For memory management, 32-bit PowerPC microprocessors implement sixteen 32-bit 
segment registers (SRs). To speed access, the 603e implements the segment registers as two 
arrays: a main array (for data memory accesses) and a shadow array (for instruction 
memory accesses). Loading a segment entry with the Move to Segment Register (mtsr) 
instruction loads both arrays. 
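
The role of the segment registers in address translation can be sketched as follows. Under the 32-bit PowerPC memory management model, the four high-order bits of the effective address select one of the sixteen segment registers, and the 24-bit virtual segment ID (VSID) it holds is combined with the page index and byte offset to form a 52-bit virtual address. The field positions and the function name below are assumptions drawn from that general model rather than from this section.

```python
def virtual_address(ea: int, segment_registers: list) -> int:
    """Form a 52-bit virtual address from a 32-bit effective address."""
    sr = segment_registers[ea >> 28]    # high-order four bits select one of 16 SRs
    vsid = sr & 0x00FFFFFF              # assumed: the VSID is the low 24 bits
    page_index = (ea >> 12) & 0xFFFF    # 16-bit page index within the segment
    byte_offset = ea & 0xFFF            # 12-bit offset within a 4-Kbyte page
    return (vsid << 28) | (page_index << 12) | byte_offset
```

The 24 + 16 + 12 = 52 bits of the result correspond to the 2^52 bytes of virtual memory mentioned in Section 1.1.6.1.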

1.3.2.7 Special-Purpose Registers (SPRs) 

The PowerPC operating environment architecture defines numerous special-purpose 
registers that serve a variety of functions, such as providing controls, indicating status, 
configuring the processor, and performing special operations. During normal execution, a 
program can access the registers, shown in Figure 1-2, depending on the program’s access 
privilege (supervisor or user, determined by the privilege-level (PR) bit in the MSR). Note 
that registers such as the GPRs and FPRs are accessed through operands that are part of the 
instructions. Access to registers can be explicit (that is, through the use of specific 
instructions for that purpose such as Move to Special-Purpose Register (mtspr) and Move 
from Special-Purpose Register (mfspr) instructions) or implicit, as the part of the execution 
of an instruction. Some registers are accessed both explicitly and implicitly. 

In the 603e, all SPRs are 32 bits wide. 
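
Explicit SPR access can be illustrated by encoding the mfspr and mtspr instructions for a given SPR number. The sketch below uses the standard PowerPC instruction layout (primary opcode 31, extended opcodes 339 and 467), in which the two five-bit halves of the SPR number are swapped within the instruction word. The helper names are illustrative.

```python
def _spr_field(spr: int) -> int:
    """Swap the two 5-bit halves of a 10-bit SPR number for encoding."""
    if not 0 <= spr < 1024:
        raise ValueError("SPR numbers are 10 bits wide")
    return ((spr & 0x1F) << 5) | (spr >> 5)


def encode_mfspr(rd: int, spr: int) -> int:
    """Encode 'mfspr rD,SPR' as a 32-bit instruction word."""
    return (31 << 26) | ((rd & 0x1F) << 21) | (_spr_field(spr) << 11) | (339 << 1)


def encode_mtspr(spr: int, rs: int) -> int:
    """Encode 'mtspr SPR,rS' as a 32-bit instruction word."""
    return (31 << 26) | ((rs & 0x1F) << 21) | (_spr_field(spr) << 11) | (467 << 1)
```

For example, `encode_mfspr(0, 8)` yields the word that assemblers emit for `mflr r0`, because the link register is SPR 8.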

1.3.2.7.1 User-Level SPRs 

The following 603e SPRs are accessible by user-level software: 

• Link register (LR) — The link register can be used to provide the branch target 
address and to hold the return address after branch and link instructions. The LR is 
32 bits wide in 32-bit implementations. 

• Count register (CTR) — The CTR is decremented and tested automatically as a result 
of branch-and-count instructions. The CTR is 32 bits wide in 32-bit 
implementations. 

• Integer exception register (XER) — The 32-bit XER contains the summary overflow 
bit, integer carry bit, overflow bit, and a field specifying the number of bytes to be 
transferred by a Load String Word Indexed (lswx) or Store String Word Indexed 
(stswx) instruction. 

1.3.2.7.2 Supervisor-Level SPRs 

The 603e also contains SPRs that can be accessed only by supervisor-level software. These 
registers consist of the following: 

• The 32-bit DSISR defines the cause of data access and alignment exceptions. 

• The data address register (DAR) is a 32-bit register that holds the address of an 
access after an alignment or DSI exception. 

• The decrementer register (DEC) is a 32-bit decrementing counter that provides a 
mechanism for causing a decrementer exception after a programmable delay. 

• The 32-bit SDR1 specifies the page table format used in virtual-to-physical address 
translation for pages. (Note that physical address is referred to as real address in the 
architecture specification.) 

• The machine status save/restore register 0 (SRR0) is a 32-bit register that is used by 
the 603e for saving the address of the instruction that caused the exception, and the 
address to return to when a Return from Interrupt (rfi) instruction is executed. 

• The machine status save/restore register 1 (SRR1) is a 32-bit register used to save 
machine status on exceptions and to restore machine status when an rfi instruction 
is executed. 

• The 32-bit SPRG0-SPRG3 registers are provided for operating system use. 

• The external access register (EAR) is a 32-bit register that controls access to the 
external control facility through the External Control In Word Indexed (eciwx) and 
External Control Out Word Indexed (ecowx) instructions. 

• The time base register (TB) is a 64-bit register that maintains the time of day and 
operates interval timers. The TB consists of two 32-bit fields — time base upper 
(TBU) and time base lower (TBL). 

• The processor version register (PVR) is a 32-bit, read-only register that identifies the 
version (model) and revision level of the PowerPC processor. 

• Block address translation (BAT) arrays — The PowerPC architecture defines 16 BAT 
registers, divided into four pairs of data BATs (DBATs) and four pairs of instruction 
BATs (IBATs). See Figure 1-2 for a list of the SPR numbers for the BAT arrays. 

The following supervisor-level SPRs are implementation-specific to the 603e: 

• The DMISS and IMISS registers are read-only registers that are loaded 
automatically upon an instruction or data TLB miss. 

• The HASH1 and HASH2 registers contain the physical addresses of the primary and 
secondary page table entry groups (PTEGs). 

• The ICMP and DCMP registers contain a duplicate of the first word in the page table 
entry (PTE) for which the table search is looking. 

• The required physical address (RPA) register is loaded by the processor with the 
second word of the correct PTE during a page table search. 

• The hardware implementation (HID0 and HID1) registers provide the means for 
enabling the 603e’s checkstops and features, and allow software to read the 
state of the PLL configuration signals. 

• The instruction address breakpoint register (IABR) is loaded with an instruction
address that is compared to instruction addresses in the dispatch queue. When an 
address match occurs, an instruction address breakpoint exception is generated. 

Figure 1-2 shows all the 603e registers available at the user and supervisor level. The
numbers to the right of the SPRs indicate the number that is used in the syntax of the
instruction operands to access the register.

[Register diagram not reproduced; the registers and SPR numbers it shows are listed below.]

USER MODEL

    General-Purpose Registers
    Floating-Point Registers
    Condition Register
    Floating-Point Status and Control Register
    XER                                    SPR 1
    Link Register (LR)                     SPR 8
    Count Register
    Time Base Facility (For Reading)

SUPERVISOR MODEL

    Configuration Registers
        Hardware Implementation Registers (1)
        Machine State Register
        Processor Version Register (PVR)   SPR 287

    Memory Management Registers
        Instruction BAT Registers
            IBAT0U   SPR 528
            IBAT0L   SPR 529
            IBAT1U   SPR 530
            IBAT1L   SPR 531
            IBAT2U   SPR 532
            IBAT2L   SPR 533
            IBAT3U   SPR 534
            IBAT3L   SPR 535
        Data BAT Registers
            DBAT0U   SPR 536
            DBAT0L   SPR 537
            DBAT1U   SPR 538
            DBAT1L   SPR 539
            DBAT2U   SPR 540
            DBAT2L   SPR 541
            DBAT3U   SPR 542
            DBAT3L   SPR 543
        SDR1         SPR 25
        Software Table Search Registers (1)
            DMISS    SPR 976
            DCMP     SPR 977
            HASH1    SPR 978
            HASH2    SPR 979
            IMISS    SPR 980
            ICMP     SPR 981
            RPA      SPR 982
        Segment Registers

    Exception Handling Registers
        Data Address Register
        DSISR
        Save and Restore
        SPRG0        SPR 272
        SPRG1        SPR 273
        SPRG2        SPR 274
        SPRG3        SPR 275

    Miscellaneous Registers
        Time Base Facility (For Writing)
        Decrementer
        Instruction Address Breakpoint Register (1)
        External Address Register (Optional)

(1) These registers are 603e-specific registers. They may not be supported by other PowerPC processors.

Figure 1-2. PowerPC 603e Microprocessor Programming Model — Registers

1.3.3 Instruction Set and Addressing Modes

The following subsections describe the PowerPC instruction set and addressing modes in 
general. 

1.3.3.1 PowerPC Instruction Set and Addressing Modes

All PowerPC instructions are encoded as single-word (32-bit) opcodes. Instruction formats 
are consistent among all instruction types, permitting efficient decoding to occur in parallel 
with operand accesses. This fixed instruction length and consistent format greatly 
simplifies instruction pipelining. 

1.3.3.1.1 PowerPC Instruction Set

The PowerPC instructions are divided into the following categories: 

• Integer instructions — These include computational and logical instructions. 

— Integer arithmetic instructions 

— Integer compare instructions 
— Integer logical instructions 
— Integer rotate and shift instructions 

• Floating-point instructions — These include floating-point computational
instructions, as well as instructions that affect the FPSCR. 

— Floating-point arithmetic instructions 
— Floating-point multiply/add instructions 
— Floating-point rounding and conversion instructions 
— Floating-point compare instructions 
— Floating-point status and control instructions 

• Load/store instructions — These include integer and floating-point load and store 
instructions. 

— Integer load and store instructions 
— Integer load and store multiple instructions 
— Floating-point load and store 

— Primitives used to construct atomic memory operations (lwarx and stwcx.
instructions) 

• Flow control instructions — These include branching instructions, condition register
logical instructions, trap instructions, and other instructions that affect the
instruction flow.

— Branch and trap instructions
— Condition register logical instructions

• Processor control instructions — These instructions are used for synchronizing
memory accesses and management of caches, TLBs, and the segment registers.

— Move to/from SPR instructions 
— Move to/from MSR 
— Synchronize 
— Instruction synchronize 

• Memory control instructions — These instructions provide control of caches, TLBs, 
and segment registers. 

— Supervisor-level cache management instructions 
— User-level cache instructions 
— Segment register manipulation instructions 
— Translation lookaside buffer management instructions 

Note that this grouping of the instructions does not indicate which execution unit executes 
a particular instruction or group of instructions. 

Integer instructions operate on byte, half-word, and word operands. Floating-point 
instructions operate on single-precision (one word) and double-precision (one double 
word) floating-point operands. The PowerPC architecture uses instructions that are four 
bytes long and word-aligned. It provides for byte, half-word, and word operand loads and 
stores between memory and a set of 32 GPRs. It also provides for word and double-word 
operand loads and stores between memory and a set of 32 floating-point registers (FPRs). 

Computational instructions do not modify memory. To use a memory operand in a 
computation and then modify the same or another memory location, the memory contents 
must be loaded into a register, modified, and then written back to the target location with 
distinct instructions. 

PowerPC processors follow the program flow when they are in the normal execution state. 
However, the flow of instructions can be interrupted directly by the execution of an 
instruction or by an asynchronous event. Either kind of exception may cause one of several 
components of the system software to be invoked. 

1.3.3.1.2 Calculating Effective Addresses

The effective address (EA) is the 32-bit address computed by the processor when executing 
a memory access or branch instruction or when fetching the next sequential instruction. 

The PowerPC architecture supports two simple memory addressing modes: 

• EA = (rA|0) + offset (including offset = 0) (register indirect with immediate index)

• EA = (rA|0) + rB (register indirect with index)

These simple addressing modes allow efficient address generation for memory accesses.
Calculation of the effective address for aligned transfers occurs in a single clock cycle.

For a memory access instruction, if the sum of the effective address and the operand length
exceeds the maximum effective address, the memory operand is considered to wrap around
from the maximum effective address to effective address 0.

Effective address computations for both data and instruction accesses use 32-bit unsigned 
binary arithmetic. A carry from bit 0 is ignored in 32-bit implementations. 
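
The two addressing modes and the 32-bit wraparound can be sketched in Python as follows. This is a minimal model under stated assumptions, not processor code: the function name `effective_address` and the use of a plain list for the GPRs are hypothetical, and the notation (rA|0) is taken to mean that register number 0 selects the literal value 0 instead of the contents of GPR0.

```python
MASK32 = 0xFFFFFFFF


def effective_address(gpr, ra, offset=0, rb=None):
    """Compute a 32-bit effective address.

    gpr: list of 32 register values.
    ra: base register number; 0 means the literal value 0 (the rA|0 rule).
    offset: immediate displacement (may be negative), used when rb is None.
    rb: index register number for the register indirect with index mode.
    """
    base = 0 if ra == 0 else gpr[ra]
    index = offset if rb is None else gpr[rb]
    # Keep only the low 32 bits: a carry out of bit 0 (the most
    # significant bit in PowerPC numbering) is discarded.
    return (base + index) & MASK32
```

With GPR3 holding 0xFFFFFFFC, `effective_address(gpr, 3, offset=8)` wraps around to 4 because the carry is ignored.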

1.3.3.2 PowerPC 603e Microprocessor Instruction Set

The 603e instruction set is defined as follows: 

• The 603e provides hardware support for all 32-bit PowerPC instructions. 

• The 603e provides two implementation-specific instructions used for software table 
search operations following TLB misses: 

- Load Data TLB Entry (tlbld)

- Load Instruction TLB Entry (tlbli) 

• The 603e implements the following instructions which are defined as optional by the 
PowerPC architecture: 

- External Control In Word Indexed (eciwx)

- External Control Out Word Indexed (ecowx) 

- Floating Select (fsel) 

- Floating Reciprocal Estimate Single-Precision (fres) 

- Floating Reciprocal Square Root Estimate (frsqrte) 

- Store Floating-Point as Integer Word (stfiwx) 

1.3.4 Cache Implementation 

The following subsections describe the PowerPC architecture’s treatment of cache in 
general, and the 603e-specific implementation, respectively. 

1.3.4.1 PowerPC Cache Characteristics

The PowerPC architecture does not define hardware aspects of cache implementations. For 
example, some PowerPC processors, including the 603e, have separate instruction and data 
caches (Harvard architecture), while others, such as the PowerPC 601™ microprocessor, 
implement a unified cache. 

PowerPC microprocessors control the following memory access modes on a page or block 
basis: 

• Write-back/write-through mode 

• Caching-inhibited mode 

• Memory coherency 

• Guarded

Note that in the 603e, a cache line is defined as eight words. The VEA defines cache
management instructions that provide a means by which the application programmer can
affect the cache contents.

1.3.4.2 PowerPC 603e Microprocessor Cache Implementation

The 603e has two 16-Kbyte, four-way set-associative (instruction and data) caches. The 
caches are physically addressed, and the data cache can operate in either write-back or 
write-through mode as specified by the PowerPC architecture. 

The data cache is configured as 128 sets of 4 lines each. Each line consists of 32 bytes, two 
state bits, and an address tag. The two state bits implement the three-state MEI 
(modified/exclusive/invalid) protocol. Each line contains eight 32-bit words. Note that the 
PowerPC architecture defines the term block as the cacheable unit. For the 603e, the block
size is equivalent to a cache line. A block diagram of the data cache organization is shown 
in Figure 1-3. 

The instruction cache also consists of 128 sets of 4 lines, and each line consists of 32 bytes, 
an address tag, and a valid bit. The instruction cache may not be written to except through 
a line fill operation. The instruction cache is not snooped, and cache coherency must be 
maintained by software. A fast hardware invalidation capability is provided to support 
cache maintenance. The organization of the instruction cache is very similar to the data 
cache shown in Figure 1-3. 

Each cache line contains eight contiguous words from memory that are loaded from an 
8-word boundary (that is, bits A27-A31 of the effective addresses are zero); thus, a cache
line never crosses a page boundary. Misaligned accesses across a page boundary can incur 
a performance penalty. 
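
The cache geometry described above (128 sets, 4 lines per set, 32 bytes per line) implies a conventional split of an address into tag, set index, and byte offset. The Python sketch below assumes that conventional split; the manual text does not spell out the exact hardware field boundaries, and the helper names are hypothetical. It also shows why a line cannot cross a 4-Kbyte page boundary: one way of the cache (128 x 32 bytes) is exactly one page.

```python
LINE_BYTES = 32      # eight 32-bit words per line
NUM_SETS = 128       # 128 sets of 4 lines each
NUM_WAYS = 4
PAGE_BYTES = 4096    # 4-Kbyte page size

OFFSET_BITS = LINE_BYTES.bit_length() - 1   # 5
INDEX_BITS = NUM_SETS.bit_length() - 1      # 7


def split_cache_address(addr: int):
    """Split a 32-bit physical address into (tag, set index, byte offset)."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def line_base(addr: int) -> int:
    """Address of the first byte of the cache line containing addr."""
    return addr & ~(LINE_BYTES - 1)
```

Under these assumptions the tag is 20 bits wide, and the set index and byte offset together cover exactly the 12-bit offset within a page.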

The 603e’s cache lines are loaded in four beats of 64 bits each when the processor bus is in 
64-bit mode. The burst load is performed as “critical double word first.” The cache that is 
being loaded is blocked to internal accesses until the load completes. The critical double 
word is simultaneously written to the cache and forwarded to the requesting unit, thus 
minimizing stalls due to load delays. 

The 603e’s cache lines are loaded in eight beats of 32 bits each when the processor is in 
32-bit bus mode. For more information see Section 8.6.1, “32-Bit Data Bus Mode.” 
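
The beat counts quoted for the two bus modes follow directly from the line size. A small Python check of that arithmetic (the function name is hypothetical):

```python
LINE_BITS = 8 * 32  # eight 32-bit words per cache line


def beats_per_line_fill(bus_width_bits: int) -> int:
    """Number of data beats needed to transfer one cache line."""
    if bus_width_bits not in (32, 64):
        raise ValueError("the 603e data bus is 32 or 64 bits wide")
    return LINE_BITS // bus_width_bits
```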

To ensure coherency among caches in a multiprocessor (or multiple caching-device) 
implementation, the 603e implements the MEI protocol. These three states, modified, 
exclusive, and invalid, indicate the state of the cache block as follows: 

• Modified — The cache line is modified with respect to system memory; that is, data 
for this address is valid only in the cache and not in system memory. 

• Exclusive — This cache line holds valid data that is identical to the data at this 
address in system memory. No other cache has this data. 

• Invalid — This cache line does not hold valid data.

Cache coherency is enforced by on-chip bus snooping logic. Since the 603e’s data cache
tags are single ported, a simultaneous load or store and snoop access represents a resource
contention. The snoop access is given first access to the tags. The load or store then occurs
on the clock following the snoop.

The 603e cache-coherency protocol is a subset of the standard MESI four-state protocol. 
To ensure coherency, all reads to the bus are done as read-with-intent-to-modify (RWITM). 
This causes a MESI device to convert to the MEI subset. 
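
The three-state protocol can be pictured as a small state machine per cache line. The Python sketch below is a simplified model and not the 603e's actual coherency logic: it assumes write-back mode, assumes that every snoop hit invalidates the line, and uses illustrative event names ("read", "write", "snoop") that do not appear in the manual.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    INVALID = "I"


def next_state(state: State, event: str):
    """Return (new_state, bus_action) for one cache line.

    "read" and "write" are accesses by the local processor; "snoop" is
    another bus master's access that hits this line.
    """
    if event == "read":
        # A miss is filled with read-with-intent-to-modify (RWITM),
        # so the line is loaded as exclusive.
        if state is State.INVALID:
            return State.EXCLUSIVE, "RWITM line fill"
        return state, None
    if event == "write":
        if state is State.INVALID:
            return State.MODIFIED, "RWITM line fill"
        return State.MODIFIED, None
    if event == "snoop":
        if state is State.MODIFIED:
            # Only this cache holds valid data, so it must be written back.
            return State.INVALID, "write back modified line"
        return State.INVALID, None
    raise ValueError(f"unknown event: {event}")
```

Because every bus read is an RWITM, no line ever needs a shared state, which is why three states suffice in this model.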




Figure 1-3. Data Cache Organization 



1.3.5 Exception Model 

The following subsections describe the PowerPC exception model and the 603e 
implementation, respectively. 

1.3.5.1 PowerPC Exception Model

The PowerPC exception mechanism allows the processor to change to supervisor state as 
a result of external signals, errors, or unusual conditions arising in the execution of 
instructions. When exceptions occur, information about the state of the processor is saved 
to certain registers and the processor begins execution at an address (exception vector) 
predetermined for each exception. Processing of exceptions occurs in supervisor mode. 

Although multiple exception conditions can map to a single exception vector, a more 
specific condition may be determined by examining a register associated with the 
exception — for example, the DSISR and the FPSCR. Additionally, some exception
conditions can be explicitly enabled or disabled by software. 

The PowerPC architecture requires that exceptions be handled in program order; therefore,
although a particular implementation may recognize exception conditions out of order, they
are presented strictly in order. When an instruction-caused exception is recognized, any
unexecuted instructions that appear earlier in the instruction stream, including any that have
not yet entered the execute state, are required to complete before the exception is taken.
Any exceptions caused by those instructions are handled first. Likewise, exceptions that are
asynchronous and precise are recognized when they occur, but are not handled until the
instruction currently in the completion stage successfully completes execution or generates
an exception, and the completed store queue is emptied.

Unless a catastrophic condition causes a system reset or machine check exception, only one 
exception is handled at a time. If, for example, a single instruction encounters multiple 
exception conditions, those conditions are handled sequentially. After the exception 
handler handles an exception, the instruction execution continues until the next exception 
condition is encountered. However, in many cases there is no attempt to re-execute the 
instruction. This method of recognizing and handling exception conditions sequentially 
guarantees that exceptions are recoverable. 

Exception handlers should save the information stored in SRR0 and SRR1 early to prevent
the program state from being lost due to a system reset or machine check exception or to
an instruction-caused exception in the exception handler, and before enabling external
interrupts.

The PowerPC architecture supports four types of exceptions: 

• Synchronous, precise — These are caused by instructions. All instruction-caused
exceptions are handled precisely; that is, the machine state at the time the exception 
occurs is known and can be completely restored. This means that (excluding the trap 
and system call exceptions) the address of the faulting instruction is provided to the 
exception handler and that neither the faulting instruction nor subsequent 
instructions in the code stream will complete execution before the exception is 
taken. Once the exception is processed, execution resumes at the address of the 
faulting instruction (or at an alternate address provided by the exception handler). 
When an exception is taken due to a trap or system call instruction, execution 
resumes at an address provided by the handler. 

• Synchronous, imprecise — The PowerPC architecture defines two imprecise 
floating-point exception modes, recoverable and nonrecoverable. Even though the 
603e provides a means to enable the imprecise modes, it implements these modes 
identically to the precise mode (that is, all enabled floating-point enabled exceptions 
are always precise on the 603e). 

• Asynchronous, maskable — The external, SMI, and decrementer interrupts are
maskable asynchronous exceptions. When these exceptions occur, their handling is 
postponed until the next instruction, and any exceptions associated with that 
instruction, completes execution. If there are no instructions in the execution units, 
the exception is taken immediately upon determination of the correct restart address 
(for loading SRR0).

• Asynchronous, nonmaskable — There are two nonmaskable asynchronous
exceptions: system reset and the machine check exception. These exceptions may 
not be recoverable, or may provide a limited degree of recoverability. All exceptions 
report recoverability through the MSR[RI] bit. 







1.3.5.2 PowerPC 603e Microprocessor Exception Model

As specified by the PowerPC architecture, all 603e exceptions can be described as either 
precise or imprecise and either synchronous or asynchronous. Asynchronous exceptions 
(some of which are maskable) are caused by events external to the processor’s execution; 
synchronous exceptions, which are all handled precisely by the 603e, are caused by 
instructions. The 603e exception classes are shown in Table 1-4. 



Table 1-4. PowerPC 603e Microprocessor Exception Classifications

Synchronous/Asynchronous    Precise/Imprecise   Exception Type
--------------------------  ------------------  ------------------------------
Asynchronous, nonmaskable   Imprecise           Machine check
                                                System reset

Asynchronous, maskable      Precise             External interrupt
                                                Decrementer
                                                System management interrupt

Synchronous                 Precise             Instruction-caused exceptions



Although exceptions have other characteristics as well, such as whether they are maskable 
or nonmaskable, the distinctions shown in Table 1-4 define categories of exceptions that the 
603e handles uniquely. Note that Table 1-4 includes no synchronous imprecise exceptions.
While the PowerPC architecture supports imprecise handling of floating-point exceptions, 
the 603e implements these exception modes as precise exceptions. 

The 603e’s exceptions, and conditions that cause them, are listed in Table 1-5. Exceptions 
that are specific to the 603e are indicated. 



Table 1-5. Exceptions and Conditions

Exception Type      Vector Offset  Causing Conditions
                    (hex)
------------------  -------------  ------------------------------------------------------------
Reserved            00000          —

System reset        00100          A system reset is caused by the assertion of either SRESET
                                   or HRESET.

Machine check       00200          A machine check is caused by the assertion of the TEA signal
                                   during a data bus transaction, assertion of MCP, APE, and
                                   DPE.

DSI                 00300          The cause of a DSI exception can be determined by the bit
                                   settings in the DSISR, listed as follows:
                                   1   Set if the translation of an attempted access is not
                                       found in the primary hash table entry group (HTEG), or
                                       in the rehashed secondary HTEG, or in the range of a
                                       DBAT register; otherwise cleared.
                                   4   Set if a memory access is not permitted by the page or
                                       DBAT protection mechanism; otherwise cleared.
                                   5   Set if memory access is attempted to a direct-store
                                       segment; otherwise cleared.
                                   6   Set for a store operation and cleared for a load
                                       operation.
                                   11  Set if eciwx or ecowx is used and EAR[E] is cleared.

ISI                 00400          An ISI exception is caused when an instruction fetch cannot
                                   be performed for any of the following reasons:
                                   • The effective (logical) address cannot be translated.
                                     That is, there is a page fault for this portion of the
                                     translation, so an ISI exception must be taken to load
                                     the PTE (and possibly the page) into memory.
                                   • The fetch access is to a direct-store segment.
                                   • The fetch access violates memory protection. If the key
                                     bits (Ks and Kp) in the segment register and the PP bits
                                     in the PTE are set to prohibit read access, instructions
                                     cannot be fetched from this location.

External interrupt  00500          An external interrupt is caused when MSR[EE] = 1 and the
                                   INT signal is asserted.

Alignment           00600          An alignment exception is caused when the 603e cannot
                                   perform a memory access for any of the following reasons:
                                   • The operand of a floating-point load or store is not
                                     word-aligned.
                                   • The operand of lmw, stmw, lwarx, or stwcx. is not
                                     word-aligned.
                                   • The operand of dcbz is in a page that is write-through
                                     or caching-inhibited for a virtual mode access.
                                   • The operand of an elementary, multiple or string load or
                                     store crosses a segment boundary with a change to the
                                     direct-store T bit.
                                   • A little-endian access is misaligned, or a multiple
                                     access is attempted with the little-endian bit set.

Program             00700          A program exception is caused by one of the following
                                   exception conditions, which correspond to bit settings in
                                   SRR1 and arise during execution of an instruction:
                                   • Floating-point enabled exception — A floating-point
                                     enabled exception condition is generated when the
                                     following condition is met:
                                     (MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1.
                                     FPSCR[FEX] is set by the execution of a floating-point
                                     instruction that causes an enabled exception or by the
                                     execution of one of the “move to FPSCR” instructions
                                     that results in both an exception condition bit and its
                                     corresponding enable bit being set in the FPSCR.
                                   • Illegal instruction — An illegal instruction program
                                     exception is generated when execution of an instruction
                                     is attempted with an illegal opcode or illegal
                                     combination of opcode and extended opcode fields
                                     (including PowerPC instructions not implemented in the
                                     603e). These do not include those optional instructions
                                     treated as no-ops.
                                   • Privileged instruction — A privileged instruction type
                                     program exception is generated when the execution of a
                                     privileged instruction is attempted and the MSR register
                                     user privilege bit, MSR[PR], is set. In the 603e, this
                                     exception is generated for mtspr or mfspr with an
                                     invalid SPR field if SPR[0] = 1 and MSR[PR] = 1. This
                                     may not be true for all PowerPC processors.
                                   • Trap — A trap type program exception is generated when
                                     any of the conditions specified in a trap instruction is
                                     met.

Floating-point      00800          A floating-point unavailable exception is caused by an
unavailable                        attempt to execute a floating-point instruction (including
                                   floating-point load, store, and move instructions) when the
                                   floating-point available bit is disabled (MSR[FP] = 0).

Decrementer         00900          The decrementer exception occurs when the most significant
                                   bit of the decrementer (DEC) register transitions from 0 to
                                   1. The decrementer exception must also be enabled with the
                                   MSR[EE] bit.

Reserved            00A00-00BFF    —

System call         00C00          A system call exception occurs when a System Call (sc)
                                   instruction is executed.

Trace               00D00          A trace exception is taken when MSR[SE] = 1 or when the
                                   currently completing instruction is a branch and
                                   MSR[BE] = 1.

Floating-point      00E00          Not implemented in the 603e.
assist

Reserved            00E10-00FFF    —

Instruction         01000          An instruction translation miss exception is caused when an
translation miss                   effective address for an instruction fetch cannot be
                                   translated by the ITLB.

Data load           01100          A data load translation miss exception is caused when an
translation miss                   effective address for a data load operation cannot be
                                   translated by the DTLB.

Data store          01200          A data store translation miss exception is caused when an
translation miss                   effective address for a data store operation cannot be
                                   translated by the DTLB; or when a DTLB hit occurs and the
                                   change bit in the PTE must be set due to a data store
                                   operation.

Instruction address 01300          An instruction address breakpoint exception occurs when the
breakpoint                         address (bits 0-29) in the IABR matches the next
                                   instruction to complete in the completion unit and the IABR
                                   enable bit (bit 30) is set.

System management   01400          A system management interrupt is caused when MSR[EE] = 1
interrupt                          and the SMI input signal is asserted.

Reserved            01500-02FFF    —
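
The vector offsets and DSISR bit assignments in Table 1-5 lend themselves to a lookup table. The Python sketch below is one hypothetical way a debugger or simulator might hold them; the dictionary and function names are illustrative. It relies on the PowerPC convention that bit 0 is the most significant bit of a register, so DSISR bit 1 corresponds to the mask 0x40000000.

```python
# Vector offsets (hex) and exception names from Table 1-5.
EXCEPTION_VECTORS = {
    0x00100: "System reset",
    0x00200: "Machine check",
    0x00300: "DSI",
    0x00400: "ISI",
    0x00500: "External interrupt",
    0x00600: "Alignment",
    0x00700: "Program",
    0x00800: "Floating-point unavailable",
    0x00900: "Decrementer",
    0x00C00: "System call",
    0x00D00: "Trace",
    0x01000: "Instruction translation miss",
    0x01100: "Data load translation miss",
    0x01200: "Data store translation miss",
    0x01300: "Instruction address breakpoint",
    0x01400: "System management interrupt",
}

# DSISR bit numbers from the DSI row of Table 1-5.
DSISR_BITS = {
    1: "translation not found",
    4: "access not permitted by page or DBAT protection",
    5: "access to a direct-store segment",
    6: "store operation (clear for a load)",
    11: "eciwx or ecowx used with EAR[E] cleared",
}


def bit_mask(bit: int, width: int = 32) -> int:
    """Mask for a PowerPC-numbered bit, where bit 0 is the most significant."""
    return 1 << (width - 1 - bit)


def decode_dsisr(dsisr: int):
    """List the DSI conditions whose DSISR bits are set."""
    return [text for bit, text in DSISR_BITS.items() if dsisr & bit_mask(bit)]
```

For instance, a DSISR value of 0x42000000 has bits 1 and 6 set, which this sketch reports as a store whose translation was not found.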



1.3.6 Memory Management

The following subsections describe the memory management features of the PowerPC 
architecture, and the 603e implementation, respectively. 

1.3.6.1 PowerPC Memory Management

The primary functions of the MMU are to translate logical (effective) addresses to physical 
addresses for memory accesses, and to provide access protection on blocks and pages of 
memory. 

There are two types of accesses generated by the 603e that require address translation — 
instruction accesses, and data accesses to memory generated by load and store instructions. 

The PowerPC MMU and exception model support demand-paged virtual memory. Virtual
memory management permits execution of programs larger than the size of physical
memory; demand-paged implies that individual pages are loaded into physical memory
from system memory only when they are first accessed by an executing program.

The hashed page table is a variable-sized data structure that defines the mapping between
virtual page numbers and physical page numbers. The page table size is a power of 2, and
its starting address is a multiple of its size.

The page table contains a number of page table entry groups (PTEGs). A PTEG contains 
eight page table entries (PTEs) of eight bytes each; therefore, each PTEG is 64 bytes long. 
PTEG addresses are entry points for table search operations. 
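
The size and alignment rules for the page table can be checked with a few lines of Python. This is a sketch of the constraints stated above only; it does not implement the hashing function, and the helper names are hypothetical.

```python
PTE_BYTES = 8
PTES_PER_PTEG = 8
PTEG_BYTES = PTE_BYTES * PTES_PER_PTEG  # 64


def is_valid_page_table(base: int, size: int) -> bool:
    """Check the size and alignment rules for a hashed page table."""
    power_of_two = size > 0 and (size & (size - 1)) == 0
    return power_of_two and size >= PTEG_BYTES and base % size == 0


def pteg_address(base: int, size: int, pteg_index: int) -> int:
    """Physical address of the PTEG with the given index."""
    if not is_valid_page_table(base, size):
        raise ValueError("invalid page table base or size")
    count = size // PTEG_BYTES
    # Because the base is a multiple of the (power-of-2) size, the byte
    # offset never overlaps the base bits, so OR is equivalent to addition.
    return base | ((pteg_index % count) * PTEG_BYTES)
```

The alignment rule is what lets a PTEG address be formed by simply merging bits, with no addition required.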

Address translations are enabled by setting bits in the MSR — MSR[IR] enables instruction
address translations and MSR[DR] enables data address translations. 

1.3.6.2 PowerPC 603e Microprocessor Memory Management

The instruction and data memory management units in the 603e provide 4 Gbytes of logical 
address space accessible to supervisor and user programs with a 4-Kbyte page size and 
256-Mbyte segment size. Block sizes range from 128 Kbyte to 256 Mbyte and are software 
selectable. In addition, the 603e uses an interim 52-bit virtual address and hashed page 
tables for generating 32-bit physical addresses. The MMUs in the 603e rely on the 
exception processing mechanism for the implementation of the paged virtual memory 
environment and for enforcing protection of designated memory areas. 
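
The sizes quoted above fix how a 32-bit effective address divides: a 256-Mbyte segment leaves 4 bits to select the segment, and a 4-Kbyte page leaves a 16-bit page index and a 12-bit byte offset. The Python sketch below illustrates that arithmetic. It assumes that the remaining 24 bits of the 52-bit virtual address come from a per-segment identifier (called `vsid` here, a name not used in this section); the function names are hypothetical.

```python
PAGE_BITS = 12      # 4-Kbyte pages
SEGMENT_BITS = 28   # 256-Mbyte segments
VA_BITS = 52        # interim virtual address width
VSID_BITS = VA_BITS - SEGMENT_BITS  # 24, derived from the sizes above


def split_effective_address(ea: int):
    """Split a 32-bit EA into (segment number, page index, byte offset)."""
    segment = ea >> SEGMENT_BITS  # 4 bits, so 16 segments
    page_index = (ea >> PAGE_BITS) & ((1 << (SEGMENT_BITS - PAGE_BITS)) - 1)
    offset = ea & ((1 << PAGE_BITS) - 1)
    return segment, page_index, offset


def virtual_address(vsid: int, ea: int) -> int:
    """Form the 52-bit virtual address from a segment identifier and an EA."""
    if not 0 <= vsid < (1 << VSID_BITS):
        raise ValueError("segment identifier does not fit in 24 bits")
    _, page_index, offset = split_effective_address(ea)
    return (vsid << SEGMENT_BITS) | (page_index << PAGE_BITS) | offset
```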

Instruction and data TLBs provide address translation in parallel with the on-chip cache 
access, incurring no additional time penalty in the event of a TLB hit. A TLB is a cache of 
the most recently used page table entries. Software is responsible for maintaining the 
consistency of the TLB with memory. The 603e’s TLBs are 64-entry, two-way 
set-associative caches that contain instruction and data address translations. The 603e 
provides hardware assist for software table search operations through the hashed page table 
on TLB misses. Supervisor software can invalidate TLB entries selectively. 

The 603e also provides independent four-entry BAT arrays for instructions and data that 
maintain address translations for blocks of memory. These entries define blocks that can 
vary from 128 Kbytes to 256 Mbytes. The BAT arrays are maintained by system software. 
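
As a rough illustration of the block-size range, the Python sketch below enumerates the sizes from 128 Kbytes to 256 Mbytes. It assumes that block sizes are powers of 2 and that a block is aligned to its own size; neither assumption is stated in this section, and the helper names are hypothetical.

```python
KBYTE = 1024
MBYTE = 1024 * KBYTE

MIN_BLOCK = 128 * KBYTE
MAX_BLOCK = 256 * MBYTE


def bat_block_sizes():
    """All block sizes from 128 Kbytes to 256 Mbytes, doubling each step."""
    sizes = []
    size = MIN_BLOCK
    while size <= MAX_BLOCK:
        sizes.append(size)
        size *= 2
    return sizes


def block_contains(block_base: int, block_size: int, addr: int) -> bool:
    """True if addr falls inside a block aligned to its own size."""
    if block_size not in bat_block_sizes() or block_base % block_size != 0:
        raise ValueError("invalid block base or size")
    return block_base <= addr < block_base + block_size
```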

As specified by the PowerPC architecture, the hashed page table is a variable-sized data 
structure that defines the mapping between virtual page numbers and physical page 
numbers. The page table size is a power of 2, and its starting address is a multiple of its size. 

Also as specified by the PowerPC architecture, the page table contains a number of page
table entry groups (PTEGs). A PTEG contains eight page table entries (PTEs) of eight bytes
each; therefore, each PTEG is 64 bytes long. PTEG addresses are entry points for table
search operations.

1.3.7 Instruction Timing

The 603e is a pipelined superscalar processor. A pipelined processor is one in which the 
processing of an instruction is reduced into discrete stages. Because the processing of an 
instruction is broken into a series of stages, an instruction does not require the entire 
resources of an execution unit. For example, after an instruction completes the decode 
stage, it can pass on to the next stage, while the subsequent instruction can advance into the 
decode stage. This improves the throughput of the instruction flow. For example, it may 
take three cycles for a floating-point instruction to complete, but if there are no stalls in the 
floating-point pipeline, a series of floating-point instructions can have a throughput of one 
instruction per cycle. 
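
The difference between latency and throughput in the floating-point example can be made concrete with a short calculation. The Python sketch below models an idealized, stall-free pipeline; the function names are hypothetical.

```python
def pipelined_cycles(n_instructions: int, latency: int) -> int:
    """Cycles to finish n instructions in a stall-free pipeline.

    The first result appears after `latency` cycles; each later
    instruction completes one cycle after its predecessor.
    """
    if n_instructions <= 0:
        return 0
    return latency + (n_instructions - 1)


def unpipelined_cycles(n_instructions: int, latency: int) -> int:
    """Cycles if each instruction had to finish before the next could start."""
    return max(n_instructions, 0) * latency
```

With a three-cycle latency, ten back-to-back instructions take 12 cycles when pipelined, compared with 30 cycles if each had to finish before the next began.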

The instruction pipeline in the 603e has four major pipeline stages, described as follows: 

• The fetch pipeline stage primarily involves retrieving instructions from the memory 
system and determining the location of the next instruction fetch. Additionally, the 
BPU decodes branches during the fetch stage and folds out branch instructions 
before the dispatch stage if possible. 

• The dispatch pipeline stage is responsible for decoding the instructions supplied by 
the instruction fetch stage, and determining which of the instructions are eligible to 
be dispatched in the current cycle. In addition, the source operands of the 
instructions are read from the appropriate register file and dispatched with the 
instruction to the execute pipeline stage. At the end of the dispatch pipeline stage, 
the dispatched instructions and their operands are latched by the appropriate 
execution unit. 

• During the execute pipeline stage each execution unit that has an executable 
instruction executes the selected instruction (perhaps over multiple cycles), writes 
the instruction's result into the appropriate rename register, and notifies the 
completion stage that the instruction has finished execution. In the case of an 
internal exception, the execution unit reports the exception to the 
completion/writeback pipeline stage and discontinues instruction execution until the 
exception is handled. The exception is not signaled until that instruction is the next 
to be completed. Execution of most floating-point instructions is pipelined within 
the FPU allowing up to three instructions to be executing in the FPU concurrently. 
The pipeline stages for the floating-point unit are multiply, add, and round-convert. 
Execution of load/store instructions is also pipelined. The load/store unit has two 
pipeline stages. The first stage is for effective address calculation and MMU 
translation and the second stage is for accessing the data in the cache. 

• The complete/writeback pipeline stage maintains the correct architectural machine
state and transfers the contents of the rename registers to the GPRs and FPRs as
instructions are retired. If the completion logic detects an instruction causing an
exception, all following instructions are cancelled, their execution results in rename
registers are discarded, and instructions are fetched from the correct instruction
stream.

A superscalar processor is one that issues multiple independent instructions into multiple
pipelines allowing instructions to execute in parallel. The 603e has five independent
execution units, one each for integer instructions, floating-point instructions, branch
instructions, load/store instructions, and system register instructions. The IU and the FPU
each have dedicated register files for maintaining operands (GPRs and FPRs, respectively),
allowing integer calculations and floating-point calculations to occur simultaneously
without interference.

Because the PowerPC architecture can be applied to such a wide variety of 
implementations, instruction timing among various PowerPC processors varies 
accordingly. 

1.3.8 System Interface 

The system interface is specific for each PowerPC microprocessor implementation. 

The 603e provides a versatile system interface that allows for a wide range of 
implementations. The interface includes a 32-bit address bus, a 32- or 64-bit data bus, and 
56 control and information signals (see Figure 1-4). The system interface allows for 
address-only transactions as well as address and data transactions. The 603e control and 
information signals include the address arbitration, address start, address transfer, transfer 
attribute, address termination, data arbitration, data transfer, data termination, and 
processor state signals. Test and control signals provide diagnostics for selected internal 
circuits. 




Figure 1-4. System Interface
(Block diagram of the 603e signal groups, including data, data arbitration, data transfer,
data termination, processor state, and test and control.)

The system interface supports bus pipelining, which allows the address tenure of one 
transaction to overlap the data tenure of another. The extent of the pipelining depends on 
external arbitration and control circuitry. Similarly, the 603e supports split-bus transactions 
for systems with multiple potential bus masters — one device can have mastership of the 
address bus while another has mastership of the data bus. Allowing multiple bus 
transactions to occur simultaneously increases the available bus bandwidth for other 
activity and as a result, improves performance. 

The 603e supports multiple masters through a bus arbitration scheme that allows various 
devices to compete for the shared bus resource. The arbitration logic can implement priority 
protocols, such as fairness, and can park masters to avoid arbitration overhead. The MEI 
protocol ensures coherency among multiple devices and system memory. Also, the 603e's 
on-chip caches and TLBs and optional second-level caches can be controlled externally. 

The 603e’s clocking structure allows the bus to operate at integer multiples of the processor 
cycle time. 

The following sections describe the 603e bus support for memory operations. Note that 
some signals perform different functions depending upon the addressing protocol used. 

1.3.8.1 Memory Accesses 

The 603e’s data bus is configured at power-up to either a 32- or 64-bit width. When the 
603e is configured with a 32-bit data bus, memory accesses allow transfer sizes of 8, 16, 
24, or 32 bits in one bus clock cycle. Data transfers occur in either single-beat transactions, 
or two-beat or eight-beat burst transactions, with a single-beat transaction transferring as 
many as 32 bits. Single- or double-beat transactions are caused by noncached accesses that 
access memory directly (that is, reads and writes when caching is disabled, 
caching-inhibited accesses, and stores in write-through mode). Eight-beat burst 
transactions, which always transfer an entire cache line (32 bytes), are initiated when a line 
is read from or written to memory. 

When the 603e is configured with a 64-bit data bus, memory accesses allow transfer sizes 
of 8, 16, 24, 32, 40, 48, 56, or 64 bits in one bus clock cycle. Data transfers occur in either 
single-beat transactions or four-beat burst transactions. Single-beat transactions are caused 
by noncached accesses that access memory directly (that is, reads and writes when caching 
is disabled, caching-inhibited accesses, and stores in write-through mode). Four-beat burst 
transactions, which always transfer an entire cache line (32 bytes), are initiated when a line 
is read from or written to memory. 
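
The burst lengths quoted above follow from dividing the 32-byte cache line by the width of the data bus. The sketch below only illustrates that arithmetic; the function and constant names are invented for this example and are not part of the 603e documentation.

```python
CACHE_LINE_BYTES = 32  # cache line size stated in this section


def burst_beats(bus_width_bits: int) -> int:
    """Return the number of data beats needed to transfer one cache line."""
    if bus_width_bits not in (32, 64):
        raise ValueError("the 603e data bus is configured as 32 or 64 bits wide")
    bytes_per_beat = bus_width_bits // 8
    return CACHE_LINE_BYTES // bytes_per_beat


for width in (32, 64):
    print(f"{width}-bit bus: {burst_beats(width)} beats per cache-line burst")
```

With a 32-bit bus this gives the eight-beat burst, and with a 64-bit bus the four-beat burst, described in the text.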

1.3.8.2 PowerPC 603e Microprocessor Signals 

The 603e signals are grouped as follows: 

• Address arbitration signals — The 603e uses these signals to arbitrate for address bus 
mastership. 

• Address transfer start signals — These signals indicate that a bus master has begun a 
transaction on the address bus. 

• Address transfer signals — These signals, which consist of the address bus, address 
parity, and address parity error signals, are used to transfer the address and to ensure 
the integrity of the transfer. 

• Transfer attribute signals — These signals provide information about the type of 
transfer, such as the transfer size and whether the transaction is bursted, 
write-through, or caching-inhibited. 

• Address transfer termination signals — These signals are used to acknowledge the 
end of the address phase of the transaction. They also indicate whether a condition 
exists that requires the address phase to be repeated. 

• Data arbitration signals — The 603e uses these signals to arbitrate for data bus 
mastership. 

• Data transfer signals — These signals, which consist of the data bus, data parity, and 
data parity error signals, are used to transfer the data and to ensure the integrity of 
the transfer. 

• Data transfer termination signals — Data termination signals are required after each 
data beat in a data transfer. In a single-beat transaction, the data termination signals 
also indicate the end of the tenure, while in burst accesses, the data termination 
signals apply to individual beats and indicate the end of the tenure only after the final 
data beat. They also indicate whether a condition exists that requires the data phase 
to be repeated. 

• System status signals — These signals include the interrupt signal, checkstop signals, 
and both soft- and hard-reset signals. These signals are used to interrupt and, under 
various conditions, to reset the processor. 

• Processor state signals — These signals indicate the state of the reservation 
coherency bit, enable the time base, provide machine quiesce control, and cause a 
machine halt on execution of a tlbsync instruction. 

• IEEE 1149.1 (JTAG)/COP interface signals — The IEEE 1149.1 test unit and the 
common on-chip processor (COP) unit are accessed through a shared set of input, 
output, and clocking signals. The IEEE 1149.1/COP interface provides a means for 
boundary scan testing and internal debugging of the 603e. 

• Test interface signals — These signals are used for production testing. 

• Clock signals — These signals determine the system clock frequency. These signals 
can also be used to synchronize multiprocessor systems. 

NOTE 

A bar over a signal name indicates that the signal is active 
low — for example, ARTRY (address retry) and TS (transfer 
start). Active-low signals are referred to as asserted (active) 
when they are low and negated when they are high. Signals that 
are not active low, such as AP0-AP3 (address bus parity 
signals) and TT0-TT4 (transfer type signals) are referred to as 
asserted when they are high and negated when they are low. 
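
The asserted/negated convention in this note can be written as a one-line rule. The helper below is a minimal sketch for illustration; its name and parameters are assumptions, not part of any 603e tool or library.

```python
def is_asserted(pin_is_high: bool, active_low: bool) -> bool:
    """Return True when a signal is asserted under the convention in the note."""
    # Active-low signals are asserted when low; all others are asserted when high.
    return (not pin_is_high) if active_low else pin_is_high


# ARTRY is an active-low example from the note; TT0 is active-high.
print(is_asserted(pin_is_high=False, active_low=True))   # ARTRY driven low
print(is_asserted(pin_is_high=True, active_low=False))   # TT0 driven high
```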

1.3.8.3 Signal Configuration 

Figure 1-5 illustrates the 603e’s logical pin configuration, showing how the signals are 
grouped. 

Figure 1-5. PowerPC 603e Microprocessor Signal Groups 

Chapter 2 

PowerPC 603e Microprocessor 
Programming Model 

This chapter describes the PowerPC programming model with respect to the PowerPC 603e 
microprocessor. It consists of three major sections that describe the following: 

• Registers implemented in the 603e 

• Operand conventions 

• The 603e instruction set 

2.1 The PowerPC 603e Processor Register Set 

This section describes the register organization in the 603e as defined by the three levels of 
the PowerPC architecture — the user instruction set architecture (UISA), the virtual 
environment architecture (VEA), and the operating environment architecture (OEA), as 
well as the 603e implementation-specific registers. Full descriptions of the basic register 
set defined by the PowerPC architecture are provided in Chapter 2, “PowerPC Register 
Set,” in The Programming Environments Manual. 

The PowerPC architecture defines register-to-register operations for all computational 
instructions. Source data for these instructions is accessed from the on-chip registers or is 
provided as an immediate value embedded in the opcode. The three-register instruction 
format allows specification of a target register distinct from the two source registers, thus 
preserving the original data for use by other instructions and reducing the number of 
instructions required for certain operations. Data is transferred between memory and 
registers with explicit load and store instructions only. 

Note that there may be registers common to other PowerPC processors that are not 
implemented in the 603e. When the 603e detects special-purpose register (SPR) encodings 
other than those defined in this document, it either takes an exception or it treats the 
instruction as a no-op. (Note that exceptions are referred to as interrupts in the architecture 
specification.) Conversely, some SPRs in the 603e may not be implemented in other 
PowerPC processors, or may not be implemented in the same way in other PowerPC 
processors. 

2.1.1 PowerPC Register Set 

The PowerPC UISA registers, shown in Figure 2-1, can be accessed by either user- or 
supervisor-level instructions (the architecture specification refers to user- and supervisor- 
level as problem state and privileged state, respectively). The general-purpose registers 
(GPRs) and floating-point registers (FPRs) are accessed through instruction operands. 
Access to registers can be explicit (that is, through the use of specific instructions for that 
purpose such as the mtspr and mfspr instructions) or implicit as part of the execution (or 
side effect) of an instruction. Some registers are accessed both explicitly and implicitly. 

The number to the right of the register name indicates the number that is used in the syntax 
of the instruction operands to access the register (for example, the number used to access 
the XER is SPR1). 

For more information on the PowerPC register set, refer to Chapter 2, “PowerPC Register 
Set,” in The Programming Environments Manual. 

Figure 2-1. PowerPC 603e Microprocessor Programming Model — Registers 

Note: Registers marked in the figure as 603e-specific may not be supported by other 
PowerPC processors. 

The 603e’s user-level registers are described as follows: 

• User-level registers (UISA) — The user-level registers can be accessed by all 
software with either user or supervisor privileges. The user-level register set 
includes the following: 

— General-purpose registers (GPRs). The general-purpose register file consists of 
thirty-two 32-bit GPRs designated as GPR0-GPR31. This register file serves as 
the data source or destination for all integer instructions and provides data for 
generating addresses. 

— Floating-point registers (FPRs). The floating-point register file consists of 
thirty-two 64-bit FPRs designated as FPR0-FPR31, which serve as the data source or 
destination for all floating-point instructions. These registers can contain data 
objects of either single- or double-precision floating-point format. 

— Condition register (CR). The CR is a 32-bit register, divided into eight 4-bit 
fields, CR0-CR7, that reflects the results of certain arithmetic operations and 
provides a mechanism for testing and branching. 

— Floating-point status and control register (FPSCR). The FPSCR is a user-control 
register that contains all floating-point exception signal bits, exception summary 
bits, exception enable bits, and rounding control bits needed for compliance with 
the IEEE 754 standard. 

The remaining user-level registers are SPRs. Note that the PowerPC architecture 
provides a separate mechanism for accessing SPRs (the mtspr and mfspr 
instructions). These instructions are commonly used to explicitly access certain 
registers, while other SPRs may be more typically accessed as the side effect of 
executing other instructions. 

— XER register (XER). The XER is a 32-bit register that indicates overflow and 
carries for integer operations. It is set implicitly by many instructions. 

— Link register (LR). The 32-bit link register provides the branch target address for 
the Branch Conditional to Link Register (bclrx) instruction, and can optionally 
be used to hold the logical address (referred to as the effective address in the 
architecture specification) of the instruction that follows a branch and link 
instruction, typically used for linking to subroutines. 

— Count register (CTR). The CTR is a 32-bit register for holding a loop count that 
can be decremented during execution of appropriately coded branch instructions. 
The CTR can also provide the branch target address for the Branch Conditional 
to Count Register (bcctrx) instruction. 

• User-level registers (VEA) — The PowerPC VEA introduces the time base facility 
(TB) for reading. The TB is a 64-bit register pair whose contents are incremented 
once every four bus clock cycles. The TB consists of two 32-bit registers — time base 
upper (TBU) and time base lower (TBL). Note that the time base registers are read- 
only when in user state. 
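
As a rough illustration of the time base arithmetic, the sketch below combines TBU and TBL into one 64-bit count and converts it to seconds using the rate given above (one increment every four bus clock cycles). The 66 MHz bus frequency and all function names are assumptions chosen for the example, not values taken from this manual.

```python
BUS_CLOCKS_PER_TB_TICK = 4  # from the text: one increment every four bus clocks


def time_base_value(tbu: int, tbl: int) -> int:
    """Combine the upper and lower 32-bit halves into the 64-bit time base."""
    return ((tbu & 0xFFFFFFFF) << 32) | (tbl & 0xFFFFFFFF)


def time_base_seconds(tbu: int, tbl: int, bus_hz: int) -> float:
    """Convert a time base reading to elapsed seconds for a given bus clock."""
    return time_base_value(tbu, tbl) * BUS_CLOCKS_PER_TB_TICK / bus_hz


BUS_HZ = 66_000_000  # hypothetical bus frequency
print(time_base_value(0x1, 0x0))
print(time_base_seconds(0x0, 16_500_000, BUS_HZ))
```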

The 603e’s supervisor-level registers are described as follows: 

• Supervisor-level registers (OEA) — The OEA defines the registers that are used 
typically by an operating system for such operations as memory management, 
configuration, and exception handling. The supervisor-level registers defined by the 
PowerPC architecture for 32-bit implementations are described as follows: 

— Configuration registers 

- Machine state register (MSR). The MSR defines the state of the processor. 
The MSR can be modified by the Move to Machine State Register (mtmsr), 
System Call (sc), and Return from Exception (rfi) instructions. It can be read 
by the Move from Machine State Register (mfmsr) instruction. 

Implementation Note — The 603e defines MSR[13] as the power 
management enable (POW) bit and MSR[14] as the temporary GPR 
remapping (TGPR) bit. These additional bits are described in Table 2-1. 



Table 2-1. MSR[POW] and MSR[TGPR] Bits 

Bit   Name   Description

13    POW    Power management enable (603e-specific)
             0  Disables programmable power modes (normal operation mode).
             1  Enables programmable power modes (nap, doze, or sleep mode).
             This bit controls the programmable power modes only; it has no effect on
             dynamic power management (DPM). MSR[POW] may be altered with an mtmsr
             instruction only. Also, when altering the POW bit, software may alter only
             this bit in the MSR and no others. The mtmsr instruction must be followed
             by a context-synchronizing instruction.
             See Chapter 9, “Power Management,” for more information on power
             management.

14    TGPR   Temporary GPR remapping (603e-specific)
             0  Normal operation
             1  TGPR mode. GPR0-GPR3 are remapped to TGPR0-TGPR3 for use by TLB miss
                routines.
             When this bit is set, all instruction accesses to GPR0-GPR3 are mapped to
             TGPR0-TGPR3, respectively, and the contents of GPR0-GPR3 remain unchanged
             for as long as the bit remains set. Attempts to use GPR4-GPR31 when this
             bit is set yield undefined results. The TGPR bit is set when an instruction
             TLB miss, data read miss, or data write miss exception is taken. It is
             cleared by an rfi instruction.



- Processor version register (PVR). This register is a read-only register that 
identifies the version (model) and revision level of the PowerPC processor. 

Implementation Note — The processor version number for the 603e is 06, 
and the first mask revision is 1.0; the PVR reads as 0x00060100. 
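
PowerPC documentation numbers register bits from the most-significant end, so bit 0 of a 32-bit register is the high-order bit. The sketch below shows how the MSR bit numbers above translate into masks and how the PVR value quoted in the implementation note splits into its two halves. The helper names are invented for this example, and reading the revision half as major.minor bytes is an assumption based only on the "1.0" and 0x0100 pairing in the note.

```python
def ppc_bit(n: int, width: int = 32) -> int:
    """Mask for PowerPC bit n, where bit 0 is the most-significant bit."""
    return 1 << (width - 1 - n)


MSR_POW = ppc_bit(13)   # power management enable
MSR_TGPR = ppc_bit(14)  # temporary GPR remapping


def decode_pvr(pvr: int) -> tuple:
    """Split the PVR into its upper (version) and lower (revision) 16 bits."""
    return (pvr >> 16) & 0xFFFF, pvr & 0xFFFF


version, revision = decode_pvr(0x00060100)
print(hex(MSR_POW), hex(MSR_TGPR))
# Assumed interpretation: high byte of the revision is major, low byte is minor.
print(f"version={version:#06x} revision={revision >> 8}.{revision & 0xFF}")
```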

— Memory management registers 

- Block-address translation (BAT) registers. The 603e includes eight pairs of 
block-address translation registers (BATs), consisting of four pairs of instruction 
BATs (IBAT0U-IBAT3U and IBAT0L-IBAT3L) and four pairs of data BATs 
(DBAT0U-DBAT3U and DBAT0L-DBAT3L). See Figure 2-1 for a list of the 
SPR numbers for the BAT registers. 

- SDR1. The SDR1 register specifies the page table base address used in 
virtual-to-physical address translation. (Note that physical address is referred 
to as real address in the architecture specification.) 

- Segment registers (SR). The PowerPC OEA defines sixteen 32-bit segment 
registers (SR0-SR15). Note that SRs are implemented on 32-bit 
implementations only. The fields in the segment register are interpreted 
differently depending on the value of bit 0. 

— Exception handling registers 

- Data address register (DAR). After a data access or an alignment exception, 
the DAR is set to the effective address generated by the faulting instruction. 

- SPRG0-SPRG3. The SPRG0-SPRG3 registers are provided for operating 
system use. 

- DSISR. The DSISR defines the cause of data access and alignment 
exceptions. 

- Machine status save/restore register 0 (SRR0). The SRR0 is used to save 
machine status on exceptions and to restore machine status when an rfi 
instruction is executed. 

- Machine status save/restore register 1 (SRR1). The SRR1 is used to save 
machine status on exceptions and to restore machine status when an rfi 
instruction is executed. 

Implementation Note — The 603e implements the KEY bit (bit 12) in the 
SRR1 register in order to simplify the table search software. For more 
information refer to Chapter 5, “Memory Management.” 

— Miscellaneous registers 

- The time base facility (TB) for writing. The TB is a 64-bit register pair that 
can be used to provide time of day or interval timing. The TB consists of two 
32-bit registers — time base upper (TBU) and time base lower (TBL). 

- Decrementer (DEC). The DEC register is a 32-bit decrementing counter that 
provides a mechanism for causing a decrementer exception after a 
programmable delay. The DEC is decremented once every four bus clock 
cycles. 

- External access register (EAR). The EAR is a 32-bit register used in 
conjunction with the eciwx and ecowx instructions. While the PowerPC 
architecture specifies that the low-order six bits of the EAR (bits 26-31) are 
used to select a device, the 603e only implements the low-order 4 bits (bits 
28-31). Note that the EAR register and the eciwx and ecowx instructions are 
optional in the PowerPC architecture and may not be supported in all 
PowerPC processors that implement the OEA. 
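
Because the DEC is decremented once every four bus clock cycles, the value to load for a given delay depends on the bus frequency. The following is a minimal sketch of that calculation; the function name and the 66 MHz bus frequency are assumptions made for the example.

```python
BUS_CLOCKS_PER_DEC_TICK = 4  # from the text: one decrement every four bus clocks


def dec_count_for_delay(delay_seconds: float, bus_hz: int) -> int:
    """Return the DEC load value that gives approximately the requested delay."""
    ticks = round(delay_seconds * bus_hz / BUS_CLOCKS_PER_DEC_TICK)
    if not 0 <= ticks <= 0xFFFFFFFF:
        raise ValueError("delay does not fit in the 32-bit DEC register")
    return ticks


BUS_HZ = 66_000_000  # hypothetical bus frequency
print(dec_count_for_delay(0.001, BUS_HZ))  # load value for a 1 ms delay
```

At the assumed 66 MHz bus clock, the 32-bit counter covers delays of up to roughly 260 seconds before the range check fails.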

2.1.2 PowerPC 603e-Specific Registers 

The 603e includes several implementation-specific SPRs that are not defined by the 
PowerPC architecture. They are the DMISS, IMISS, DCMP, ICMP, HASH1, HASH2, 
RPA, HID0, HID1, and IABR registers. These registers can be accessed by supervisor-level 
instructions only. Any attempt to access these SPRs with user-level instructions results in a 
supervisor-level exception. The SPR numbers for these registers are shown in Figure 2-1. 

The DMISS, IMISS, DCMP, ICMP, HASH1, HASH2, and RPA registers are used for 
software table search operations and should only be accessed when address translation is 
disabled (that is, MSR[IR] = 0 and MSR[DR] = 0). For a complete discussion of software 
table search operations, refer to Section 5.5.2, “Table Search Operation with the PowerPC 
603e Microprocessor.” 

2.1.2.1 Hardware Implementation Registers (HID0 and HID1) 

The HID0 and HID1 registers, shown in Figure 2-2 and Figure 2-3 respectively, define 
enable bits for various 603e-specific features. 

Figure 2-2. Hardware Implementation Register 0 (HID0) 

Table 2-2 shows the bit definitions for HID0. 



Table 2-2. HID0 Bit Settings 

Bit(s)   Name     Description
0        EMCP     Enable machine check pin
1        —        Reserved
2        EBA      Enable bus address parity checking
3        EBD      Enable bus data parity checking
4        SBCLK    Select bus clock for test clock pin
5        EICE     Enable ICE outputs (pipeline tracking support)
6        ECLK     Enable external test clock pin
7        PAR      Disable precharge of ARTRY and shared signals
8        DOZE     Doze mode: PLL, time base, and snooping active (see Note 1)
9        NAP      Nap mode: PLL and time base active (see Note 1)
10       SLEEP    Sleep mode: no external clock required (see Note 1)
11       DPM      Enable dynamic power management (see Note 1)
12       RISEG    Reserved for test
13-14    —        Reserved
15       NHR      Reserved
16       ICE      Instruction cache enable (see Note 2)
17       DCE      Data cache enable (see Note 2)
18       ILOCK    Instruction cache LOCK (see Note 2)
19       DLOCK    Data cache LOCK (see Note 2)
20       ICFI     Instruction cache flash invalidate (see Note 2)
21       DCFI     Data cache flash invalidate (see Note 2)
22-26    —        Reserved
27       FBIOB    Force branch indirect on bus
28-30    —        Reserved
31       NOOPTI   No-op touch instructions

Notes: 

1. See Chapter 9, “Power Management,” for more information. 

2. See Chapter 3, “Instruction and Data Cache Operation,” for more information. 
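
The bit assignments in Table 2-2 can be turned into masks with the same most-significant-bit-first numbering used throughout this manual. The sketch below is illustrative only: the dictionary and function names are invented, and the bits listed as reserved (including RISEG and NHR) are deliberately left out.

```python
# Named HID0 bits from Table 2-2, keyed by PowerPC bit number (bit 0 = MSB).
HID0_BITS = {
    0: "EMCP", 2: "EBA", 3: "EBD", 4: "SBCLK", 5: "EICE", 6: "ECLK", 7: "PAR",
    8: "DOZE", 9: "NAP", 10: "SLEEP", 11: "DPM", 16: "ICE", 17: "DCE",
    18: "ILOCK", 19: "DLOCK", 20: "ICFI", 21: "DCFI", 27: "FBIOB", 31: "NOOPTI",
}


def hid0_mask(name: str) -> int:
    """Return the 32-bit mask for a named HID0 bit."""
    for bit, bit_name in HID0_BITS.items():
        if bit_name == name:
            return 1 << (31 - bit)
    raise KeyError(name)


def decode_hid0(value: int) -> list:
    """List the names of all named bits that are set, in bit-number order."""
    return [name for bit, name in sorted(HID0_BITS.items())
            if value & (1 << (31 - bit))]


value = hid0_mask("ICE") | hid0_mask("DCE")  # both caches enabled
print(hex(value), decode_hid0(value))
```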

Figure 2-3. Hardware Implementation Register 1 (HID1) 

Table 2-3 shows the bit definitions for HID1. 

Table 2-3. HID1 Bit Settings 

Bit(s)   Name   Description
0        PC0    PLL configuration bit 0 (read-only)
1        PC1    PLL configuration bit 1 (read-only)
2        PC2    PLL configuration bit 2 (read-only)
3        PC3    PLL configuration bit 3 (read-only)
4-31     —      Reserved

Note: The clock configuration bits reflect the state of the PLL_CFG0-PLL_CFG3 signals. 
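
Since PC0-PC3 occupy bits 0-3 of HID1, they are the four high-order bits of the register value. A minimal sketch of extracting them follows; the function name is an assumption, and the example makes no attempt to interpret the resulting value as a clock ratio.

```python
def pll_config(hid1: int) -> int:
    """Return PC0-PC3 as a 4-bit value, with PC0 as the most-significant bit."""
    return (hid1 >> 28) & 0xF


print(f"{pll_config(0x50000000):04b}")  # PC0..PC3 for a sample HID1 value
```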

2.1.2.2 Data and Instruction TLB Miss Address Registers 
(DMISS and IMISS) 

The DMISS and IMISS registers have the same format as shown in Figure 2-4. They are 
loaded automatically upon a data or instruction TLB miss. The DMISS and IMISS contain 
the effective page address of the access that caused the TLB miss exception. The contents 
are used by the 603e when calculating the values of HASH1 and HASH2, and by the tlbld 
and tlbli instructions when loading a new TLB entry. Note that the 603e always loads the 
DMISS register with a big-endian address, even when MSR[LE] is set. Both registers can be 
read and written by software. 



Figure 2-4. DMISS and IMISS Registers 
(Register layout: bits 0-31 hold the effective page address.)

2.1.2.3 Data and Instruction TLB Compare Registers 
(DCMP and ICMP) 

The DCMP and ICMP registers are shown in Figure 2-5. These registers contain the first 
word in the required PTE. The contents are constructed automatically from the contents of 
the segment registers and the effective address (DMISS or IMISS) when a TLB miss 
exception occurs. Each PTE read from the tables during the table search process should be 
compared with this value to determine whether or not the PTE is a match. Upon execution 
of a tlbld or tlbli instruction, the upper 25 bits of the DCMP or ICMP register and 11 bits 
of the effective address operand are loaded into the first word of the selected TLB entry. 
Both registers can be read and written by software. 

Figure 2-5. DCMP and ICMP Registers 

Table 2-4 describes the bit settings for the DCMP and ICMP registers. 

Table 2-4. DCMP and ICMP Bit Settings 

Bits    Name   Description
0       V      Valid bit. Set by the processor on a TLB miss exception.
1-24    VSID   Virtual segment ID. Copied from VSID field of corresponding
               segment register.
25      —      Reserved
26-31   API    Abbreviated page index. Copied from API of effective address.
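
To make the layout in Table 2-4 concrete, the sketch below assembles a compare word from a VSID and an effective address and uses it to test a PTE. The function names are invented for this example, and the position of the API within the effective address (bits 4-9) is an assumption taken from the 32-bit PowerPC address format rather than from this section. The reserved bit 25 is left clear.

```python
def tlb_compare_value(vsid: int, effective_address: int) -> int:
    """Build a DCMP/ICMP-style word: V (bit 0), VSID (bits 1-24), API (bits 26-31)."""
    api = (effective_address >> 22) & 0x3F  # assumed: API is EA bits 4-9
    return (1 << 31) | ((vsid & 0xFFFFFF) << 7) | api


def pte_matches(pte_word0: int, compare_value: int) -> bool:
    """A PTE matches when its first word equals the compare value."""
    return pte_word0 == compare_value


compare = tlb_compare_value(vsid=0x123456, effective_address=0x0FC00000)
print(hex(compare), pte_matches(compare, compare))
```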

2.1.2.4 Primary and Secondary Hash Address Registers 
(HASH1 and HASH2) 

The HASH1 and HASH2 registers contain the physical addresses of the primary and 
secondary PTEGs for the access that caused the TLB miss exception. For convenience, the 
603e automatically constructs the full physical address by routing bits 0-6 of SDR1 into 
HASH1 and HASH2 and clearing the lower 6 bits. These registers are read-only and are 
constructed from the contents of the DMISS or IMISS register (the register choice is 
determined by which miss was last acknowledged). The format for the HASH1 and HASH2 
registers is shown in Figure 2-6. 



Figure 2-6. HASH1 and HASH2 Registers 
(Register layout: bits 0-6 HTABORG[0-6], bits 7-25 hashed page address, bits 26-31 
always 000000.)

Table 2-5 describes the bit settings of the HASH1 and HASH2 registers. 

Table 2-5. HASH1 and HASH2 Bit Settings 

Bits    Name                  Description
0-6     HTABORG[0-6]          Copy of the upper 7 bits of the HTABORG field from SDR1
7-25    Hashed page address   Address bits 7-25 of the PTEG to be searched
26-31   —                     Reserved
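
The field boundaries in Table 2-5 can be checked with a few shifts and masks. The sketch below splits a HASH1 or HASH2 value into its fields and lists the PTE addresses inside the PTEG it points to. The function names are invented, and the PTEG geometry (eight PTEs of eight bytes each, 64 bytes in total) is an assumption taken from the 32-bit PowerPC page table format; it is consistent with the six low-order bits being zero.

```python
PTE_BYTES = 8      # assumed size of one PTE (two 32-bit words)
PTES_PER_PTEG = 8  # assumed number of PTEs in a PTEG


def split_hash(value: int) -> dict:
    """Split a HASH1/HASH2 value into the fields listed in Table 2-5."""
    return {
        "htaborg_0_6": (value >> 25) & 0x7F,            # bits 0-6
        "hashed_page_address": (value >> 6) & 0x7FFFF,  # bits 7-25
        "low_bits": value & 0x3F,                       # bits 26-31, read as zero
    }


def pte_addresses(pteg_address: int) -> list:
    """Physical address of each PTE in the PTEG starting at pteg_address."""
    return [pteg_address + i * PTE_BYTES for i in range(PTES_PER_PTEG)]


sample = 0xFE000040  # arbitrary example value
print(split_hash(sample))
print([hex(a) for a in pte_addresses(sample)])
```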



2.1.2.5 Required Physical Address Register (RPA) 

The RPA register is shown in Figure 2-7. During a page table search operation, the software 
must load the RPA with the second word of the correct PTE. When the tlbld or tlbli 
instruction is executed, the contents of the RPA register and the DMISS or IMISS register 
are merged and loaded into the selected TLB entry. The referenced (R) bit is ignored when 
the write occurs (no location exists in the TLB entry for this bit). The RPA register can be 
read and written by software. 



Figure 2-7. Required Physical Address Register (RPA) 
(Register layout: bits 0-19 RPN, bits 20-22 reserved (000), bit 23 R, bit 24 C, 
bits 25-28 WIMG, bit 29 reserved, bits 30-31 PP.)

Table 2-6 describes the bit settings of the RPA register. 

Table 2-6. RPA Bit Settings 

Bits    Name   Description
0-19    RPN    Physical page number from PTE
20-22   —      Reserved
23      R      Referenced bit from PTE
24      C      Changed bit from PTE
25-28   WIMG   Memory/cache access attribute bits
29      —      Reserved
30-31   PP     Page protection bits from PTE
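
The RPA layout in Table 2-6 can be sketched as a pair of pack and unpack helpers. These are illustrative only; the function names and the sample field values are assumptions, and the reserved bits (20-22 and 29) are always written as zero.

```python
def pack_rpa(rpn: int, c: int, wimg: int, pp: int, r: int = 0) -> int:
    """Assemble an RPA value from the fields listed in Table 2-6."""
    return (
        ((rpn & 0xFFFFF) << 12)  # bits 0-19: physical page number
        | ((r & 0x1) << 8)       # bit 23: referenced (ignored on the TLB write)
        | ((c & 0x1) << 7)       # bit 24: changed
        | ((wimg & 0xF) << 3)    # bits 25-28: memory/cache access attributes
        | (pp & 0x3)             # bits 30-31: page protection
    )


def unpack_rpa(value: int) -> dict:
    """Split an RPA value back into its fields."""
    return {
        "rpn": (value >> 12) & 0xFFFFF,
        "r": (value >> 8) & 0x1,
        "c": (value >> 7) & 0x1,
        "wimg": (value >> 3) & 0xF,
        "pp": value & 0x3,
    }


rpa = pack_rpa(rpn=0x12345, c=1, wimg=0b0010, pp=0b10)
print(hex(rpa), unpack_rpa(rpa))
```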



2.1.2.6 Instruction Address Breakpoint Register (IABR) 

The IABR, shown in Figure 2-8, controls the instruction address breakpoint exception. 
IABR[CEA] holds an effective address to which each instruction is compared. The 
exception is enabled by setting bit 30 of IABR. The exception is taken when there is an 
instruction address breakpoint match on the next instruction to complete. The instruction 
tagged with the match will not be completed before the breakpoint exception is taken. 



Figure 2-8. Instruction Address Breakpoint Register (IABR) 
(Register layout: bits 0-29 CEA, bit 30 enable, bit 31 reserved.)



The bits in the IABR are defined as shown in Table 2-7. 

Table 2-7. Instruction Address Breakpoint Register Bit Settings 

Bit    Description
0-29   Word address to be compared
30     IABR enabled. Setting this bit indicates that the IABR exception is enabled.
31     Reserved
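
As an illustration of the layout in Table 2-7, the sketch below forms an IABR value from a word-aligned instruction address and the enable bit. The function and constant names are assumptions made for this example.

```python
IABR_ENABLE = 1 << 1  # bit 30 in PowerPC numbering (bit 0 = MSB)


def iabr_value(address: int, enable: bool = True) -> int:
    """Combine a word-aligned instruction address with the IABR enable bit."""
    if address & 0x3:
        raise ValueError("breakpoint address must be word-aligned")
    return (address & 0xFFFFFFFC) | (IABR_ENABLE if enable else 0)


print(hex(iabr_value(0x00001000)))
print(hex(iabr_value(0x00001000, enable=False)))
```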



2.2 Operand Conventions 

This section describes the operand conventions as they are represented in two levels of the 
PowerPC architecture. It also provides detailed descriptions of conventions used for storing 
values in registers and memory, accessing the 603e’s registers, and representation of data 
in these registers. 

2.2.1 Floating-Point Execution Models — UISA 

The IEEE 754 standard includes 64- and 32-bit arithmetic. The standard requires that 
single-precision arithmetic be provided for single-precision operands. The standard permits 
double-precision arithmetic instructions to have either (or both) single-precision or double- 
precision operands, but states that single-precision arithmetic instructions should not 
accept double-precision operands. 

The PowerPC UISA follows these guidelines: 

• Double-precision arithmetic instructions may have single-precision operands but 
always produce double-precision results. 

• Single-precision arithmetic instructions require all operands to be single-precision 
and always produce single-precision results. 

For arithmetic instructions, conversions from double- to single-precision must be done 
explicitly by software, while conversions from single- to double-precision are done 
implicitly. 
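
The asymmetry between the two conversions reflects the fact that every single-precision value is exactly representable in double precision, while the reverse conversion may round. The sketch below demonstrates this on a host machine using Python's struct module; it does not model the 603e FPU, and the function name is invented for the example.

```python
import struct


def to_single(x: float) -> float:
    """Round a double to the nearest single-precision value, returned as a double."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


# 0.5 is exactly representable in single precision, so nothing is lost.
print(to_single(0.5) == 0.5)
# 0.1 is not, so converting to single precision changes the value.
print(to_single(0.1) == 0.1)
# Widening the rounded value back to double and narrowing again is stable.
print(to_single(to_single(0.1)) == to_single(0.1))
```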

All PowerPC implementations provide the equivalent of the following execution models to 
ensure that identical results are obtained. The definition of the arithmetic instructions for 
infinities, denormalized numbers, and NaNs follow conventions described in the following 
sections. 

Although the double-precision format specifies an 11-bit exponent, exponent arithmetic 
uses two additional bit positions to avoid potential transient overflow conditions. An extra 
bit is required when denormalized double-precision numbers are prenormalized. A second 
bit is required to permit computation of the adjusted exponent value in the following 
examples when the corresponding exception enable bit is 1: 

• Underflow during multiplication using a denormalized factor 

• Overflow during division using a denormalized divisor 

2.2.2 Data Organization in Memory and Data Transfers 

Bytes in memory are numbered consecutively starting with 0. Each number is the address 
of the corresponding byte. 

Memory operands may be bytes, half words, words, or double words, or, for the load/store 
multiple and move assist instructions, a sequence of bytes or words. The address of a 
memory operand is the address of its first byte (that is, of its lowest-numbered byte). 
Operand length is implicit for each instruction. 

2.2.3 Alignment and Misaligned Accesses 

The operand of a single-register memory access instruction has a natural alignment 
boundary equal to the operand length. In other words, the “natural” address of an operand 
is an integral multiple of the operand length. A memory operand is said to be aligned if it 
is aligned at its natural boundary; otherwise it is misaligned. 

Operands for single-register memory access instructions have the characteristics shown in 
Table 2-8. (Although not permitted as memory operands, quad words are shown because 
quad-word alignment is desirable for certain memory operands.) 

Table 2-8. Memory Operands 

Operand       Length     Addr[28-31] If Aligned
Byte          8 bits     xxxx
Half word     2 bytes    xxx0
Word          4 bytes    xx00
Double word   8 bytes    x000
Quad word     16 bytes   0000

Note: An “x” in an address bit position indicates that the bit can 
be 0 or 1 independent of the state of other bits in the 
address. 



The concept of alignment is also applied more generally to data in memory. For example, 
a 12-byte data item is said to be word-aligned if its address is a multiple of four. 
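
The alignment rule amounts to a single remainder test against the operand length. A minimal sketch follows, using the operand sizes from Table 2-8; the dictionary and function names are invented for the example.

```python
# Operand lengths in bytes, from Table 2-8.
OPERAND_BYTES = {"byte": 1, "half word": 2, "word": 4, "double word": 8, "quad word": 16}


def is_aligned(address: int, operand: str) -> bool:
    """True when the address is an integral multiple of the operand length."""
    return address % OPERAND_BYTES[operand] == 0


print(is_aligned(0x1004, "word"))         # low address bits are 0100
print(is_aligned(0x1004, "double word"))  # would need low bits x000
```

The same test covers the more general usage in the text: a 12-byte data item is word-aligned when `is_aligned(address, "word")` holds for its starting address.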

Implementation Notes — The following describes how the 603e handles alignment and 
misaligned accesses: 

• The 603e provides hardware support for some misaligned memory accesses. 
However, misaligned accesses will suffer a performance degradation compared to 
aligned accesses of the same type. 

• The 603e does not provide hardware support for floating-point load/store operations 
that are not word-aligned. In such a case, the 603e will invoke an alignment 
exception and the exception handler must break up the misaligned access. For this 
reason, floating-point single- and double-word accesses should always be word- 
aligned. Note that a floating-point double-word access on a word-aligned boundary 
requires an extra cycle to complete. 

Any memory access that crosses an alignment boundary must be broken into multiple 
discrete accesses. This includes half-word, word, double-word, and string references. For 
the case of string accesses, the hardware makes no attempt to get aligned in an effort to 
reduce the number of discrete accesses. (Multiword accesses are architecturally required to 
be aligned.) The resulting performance degradation depends upon how well each individual 
access behaves with respect to the memory hierarchy. At a minimum, additional cache 
access cycles are required. More dramatically, for the case of access to a noncacheable 
page, each discrete access involves an individual bus operation which will reduce the 
effective bandwidth of the bus. 

The frequent use of misaligned accesses is discouraged since they can compromise the 
overall performance of the processor. 

2.2.4 Floating-Point Operand 

The 603e provides hardware support for all single- and double-precision floating-point 
operations for most value representations and all rounding modes. The PowerPC 
architecture provides for hardware to implement a floating-point system as defined in 
ANSI/IEEE standard 754-1985, IEEE Standard for Binary Floating Point Arithmetic. For 
detailed information about the floating-point execution model refer to Chapter 3, “Operand 
Conventions,” in The Programming Environments Manual. 

2.2.5 Effect of Operand Placement on Performance 

The VEA states that the placement (location and alignment) of operands in memory affects 
the relative performance of memory accesses. The best performance is guaranteed if 
memory operands are aligned on natural boundaries. To obtain the best performance from 
the 603e, the programmer should assume the performance model described in Chapter 3, 
“Operand Conventions,” in The Programming Environments Manual. 

2.3 Instruction Set Summary 

This section describes instructions and addressing modes defined for the 603e. These 
instructions are divided into the following functional categories: 

• Integer instructions: These include arithmetic and logical instructions. For more 
information, see Section 2.3.4.1, “Integer Instructions.” 

• Floating-point instructions: These include floating-point arithmetic instructions, as 
well as instructions that affect the floating-point status and control register (FPSCR). 
For more information, see Section 2.3.4.2, “Floating-Point Instructions.” 

• Load and store instructions: These include integer and floating-point load and store 
instructions. For more information, see Section 2.3.4.3, “Load and Store 
Instructions.” 

• Flow control instructions: These include branching instructions, condition register 
logical instructions, and other instructions that affect the instruction flow. For more 
information, see Section 2.3.4.4, “Branch and Flow Control Instructions.” 

• Trap instructions: These instructions are used to test for a specified set of 
conditions; see Section 2.3.4.5, “Trap Instructions,” for more information. 

• Processor control instructions: These instructions are used for synchronizing 
memory accesses and managing caches, TLBs, and segment registers. For more 
information, see Sections 2.3.4.6, 2.3.5.1, and 2.3.6.2. 

• Memory synchronization instructions: These instructions are used for memory 
synchronization. See Sections 2.3.4.7 and 2.3.5.2 for more information. 

• Memory control instructions: These instructions provide control of caches, TLBs, 
and segment registers. For more information, see Sections 2.3.5.3 and 2.3.6.3. 

• System linkage instructions: For more information, see Section 2.3.6.1, “System 
Linkage Instructions.” 

• External control instructions: These include instructions for use with special 
input/output devices. For more information, see Section 2.3.5.4, “External Control 
Instructions.” 

Note that this grouping of instructions does not necessarily indicate the execution unit that 
processes a particular instruction or group of instructions. This information, which is useful 
in taking full advantage of the 603e’s superscalar parallel instruction execution, is provided 
in Chapter 8, “Instruction Set,” in The Programming Environments Manual. 

Integer instructions operate on word operands. Floating-point instructions operate on 
single-precision and double-precision floating-point operands. The PowerPC architecture 
uses instructions that are four bytes long and word-aligned. It provides for byte, half-word, 
and word operand loads and stores between memory and a set of 32 general-purpose 
registers (GPRs). It also provides for word and double-word operand loads and stores 
between memory and a set of 32 floating-point registers (FPRs). 

Arithmetic and logical instructions do not read or modify memory. To use the contents of a 
memory location in a computation and then modify the same or another memory location, 
the memory contents must be loaded into a register, modified, and then written to the target 
location using load and store instructions. 

The description of each instruction includes the mnemonic and a formatted list of operands. 
To simplify assembly language programming, a set of simplified mnemonics (extended 
mnemonics in the architecture specification) and symbols is provided for some of the 
frequently-used instructions; see Appendix F, “Simplified Mnemonics,” in The 
Programming Environments Manual for a complete list of simplified mnemonic examples. 

2.3.1 Classes of Instructions 

The 603e instructions belong to one of the following three classes: 

• Defined 

• Illegal 

• Reserved 

Note that while the definitions of these terms are consistent among the PowerPC 
processors, the assignment of these classifications is not. For example, an instruction that 
is specific to 64-bit implementations is considered defined for 64-bit implementations but 
illegal for 32-bit implementations such as the 603e. 

The class is determined by examining the primary opcode and the extended opcode, if any. 
If the opcode, or combination of opcode and extended opcode, is not that of a defined 
instruction or of a reserved instruction, the instruction is illegal. 







In future versions of the PowerPC architecture, instruction codings that are now illegal may 
become assigned to instructions in the architecture, or may be reserved by being assigned 
to processor-specific instructions. 

2.3.1.1 Definition of Boundedly Undefined 

If instructions are encoded with incorrectly set bits in reserved fields, the results on 
execution can be said to be boundedly undefined. If a user-level program executes the 
incorrectly coded instruction, the resulting undefined results are bounded in that a spurious 
change from user to supervisor state is not allowed, and the level of privilege exercised by 
the program in relation to memory access and other system resources cannot be exceeded. 
Boundedly undefined results for a given instruction may vary between implementations, 
and between execution attempts in the same implementation. 

2.3.1.2 Defined Instruction Class 

Defined instructions are guaranteed to be supported in all PowerPC implementations, 
except as stated in the instruction descriptions in Chapter 8, “Instruction Set,” in The 
Programming Environments Manual. The 603e provides hardware support for all 
instructions defined for 32-bit implementations. 

A PowerPC processor invokes the illegal instruction error handler (part of the program 
exception) when unimplemented PowerPC instructions are encountered so that they can be 
emulated in software, as required. 

A defined instruction can have invalid forms, as described in the following subsection. 

2.3.1.3 Illegal Instruction Class 

Illegal instructions can be grouped into the following categories: 

• Instructions that are not implemented in the PowerPC architecture. These opcodes 
are available for future extensions of the PowerPC architecture; that is, future 
versions of the PowerPC architecture may define any of these instructions to 
perform new functions. 

The following primary opcodes are defined as illegal but may be used in future 
extensions to the architecture: 

1, 4, 5, 6, 9, 22, 56, 57, 60, 61 

• Instructions that are implemented in the PowerPC architecture but are not 
implemented in a specific PowerPC implementation. For example, instructions that 
can be executed on 64-bit PowerPC processors are considered illegal by 32-bit 
processors. 

The following primary opcodes are defined for 64-bit implementations only and are 
illegal on the 603e: 

2, 30, 58, 62 

• All unused extended opcodes are illegal. The unused extended opcodes can be 
determined from information in Section A.2, “Instructions Sorted by Opcode,” and 
Section 2.3.1.4, “Reserved Instruction Class.” Notice that extended opcodes for 
instructions that are defined only for 64-bit implementations are illegal in 32-bit 
implementations, and vice versa. 

The following primary opcodes have unused extended opcodes. 

17, 19, 31, 59, 63 (primary opcodes 30 and 62 are illegal for all 32-bit 
implementations, but as 64-bit opcodes they have some unused extended opcodes) 

• An instruction consisting entirely of zeros is guaranteed to be an illegal instruction. 
This increases the probability that an attempt to execute data or uninitialized 
memory invokes the system illegal instruction error handler (a program exception). 
Note that if only the primary opcode consists of all zeros, the instruction is 
considered a reserved instruction. This is further described in Section 2.3.1.4, 
“Reserved Instruction Class.” 

An attempt to execute an illegal instruction invokes the illegal instruction error handler (a 
program exception) but has no other effect. See Section 4.5.7, “Program Exception 
(0x00700),” for additional information about illegal and invalid instruction exceptions. 

With the exception of the instruction consisting entirely of binary zeros, the illegal 
instructions are available for further additions to the PowerPC architecture. 
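
The primary-opcode rules above can be sketched in Python as a first-pass classifier. This is an illustrative model under stated assumptions, not a decoder: the function and constant names are hypothetical, the primary opcode is taken to be bits 0-5 (the six most-significant bits) of the 32-bit instruction word, and instructions that pass this first check still require the extended-opcode lookup described above.

```python
# Primary opcodes listed in this section (names are hypothetical).
FUTURE_ILLEGAL = {1, 4, 5, 6, 9, 22, 56, 57, 60, 61}  # available for future use
ONLY_64_BIT = {2, 30, 58, 62}                          # illegal on the 603e


def primary_opcode(instruction: int) -> int:
    """Extract bits 0-5 (the six most-significant bits) of the instruction."""
    return (instruction >> 26) & 0x3F


def classify_by_primary_opcode(instruction: int) -> str:
    """First-pass classification using only the primary opcode."""
    instruction &= 0xFFFFFFFF
    if instruction == 0:
        return "illegal (all zeros)"
    opcode = primary_opcode(instruction)
    if opcode == 0:
        # Primary opcode of zero with other bits set: treated as reserved.
        return "reserved"
    if opcode in FUTURE_ILLEGAL:
        return "illegal (available for future use)"
    if opcode in ONLY_64_BIT:
        return "illegal (64-bit only)"
    return "not illegal by primary opcode alone"


print(classify_by_primary_opcode(0x00000000))
print(classify_by_primary_opcode(30 << 26))
```

The final category is deliberately noncommittal: primary opcodes 17, 19, 31, 59, and 63 have unused extended opcodes, so a complete classifier would also need the opcode tables in Section A.2.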

2.3.1.4 Reserved Instruction Class 

Reserved instructions are allocated to specific implementation-dependent purposes not 
defined by the PowerPC architecture. An attempt to execute an unimplemented reserved 
instruction invokes the illegal instruction error handler (a program exception). See 
Section 4.5.7, “Program Exception (0x00700),” for additional information about illegal 
and invalid instruction exceptions. 

The following types of instructions are included in this class: 

• Implementation-specific instructions (for example, Load Data TLB Entry (tlbld) 
and Load Instruction TLB Entry (tlbli) instructions) 

• Optional instructions defined by the PowerPC architecture but not implemented by 
the 603e (for example, Floating Square Root (fsqrt) and Floating Square Root 
Single (fsqrts) instructions) 

2.3.2 Addressing Modes 

This section provides an overview of conventions for addressing memory and for 
calculating effective addresses as defined by the PowerPC architecture for 32-bit 
implementations. For more detailed information, see “Conventions,” in Chapter 4, 
“Addressing Modes and Instruction Set Summary,” of The Programming Environments 
Manual. 

2.3.2.1 Memory Addressing 

A program references memory using the effective (logical) address computed by the 
processor when it executes a memory access or branch instruction or when it fetches the 
next sequential instruction. 

2.3.2.2 Memory Operands 

Bytes in memory are numbered consecutively starting with zero. Each number is the 
address of the corresponding byte. 

Memory operands may be bytes, half words, words, or double words, or, for the load/store 
multiple and load/store string instructions, a sequence of bytes or words. The address of a 
memory operand is the address of its first byte (that is, of its lowest-numbered byte). 
Operand length is implicit for each instruction. The PowerPC architecture supports both 
big-endian and little-endian byte ordering. The default byte and bit ordering is big-endian. 
See “Byte Ordering” in Chapter 3, “Operand Conventions,” in The Programming 
Environments Manual for more information about big-endian and little-endian byte 
ordering. 

The operand of a single-register memory access instruction has a natural alignment 
boundary equal to the operand length. In other words, the “natural” address of an operand 
is an integral multiple of the operand length. A memory operand is said to be aligned if it 
is aligned at its natural boundary; otherwise it is misaligned. For a detailed discussion about 
memory operands, see Chapter 3, “Operand Conventions,” in The Programming 
Environments Manual. 

2.3.2.3 Effective Address Calculation 

An effective address (EA) is the 32-bit sum computed by the processor when executing a 
memory access or branch instruction or when fetching the next sequential instruction. For 
a memory access instruction, if the sum of the effective address and the operand length 
exceeds the maximum effective address, the memory operand is considered to wrap around 
from the maximum effective address through effective address 0, as described in the 
following paragraphs. 

Effective address computations for both data and instruction accesses use 32-bit unsigned 
binary arithmetic. A carry from bit 0 is ignored. 
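
The wraparound behavior described above can be modeled with modulo-2^32 arithmetic. The following Python sketch is illustrative only; the helper names are hypothetical, and the 16-bit sign-extended displacement is an assumption chosen to resemble the immediate index form of load and store addressing.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed displacement."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base: int, displacement: int) -> int:
    """32-bit unsigned sum; the carry out of bit 0 (the MSB) is discarded."""
    return (base + displacement) & MASK32


def operand_byte_addresses(ea: int, length: int) -> list:
    """Address of each operand byte, wrapping from 0xFFFFFFFF through 0."""
    return [(ea + offset) & MASK32 for offset in range(length)]


# A word operand that starts two bytes below the top of the address space.
print([hex(a) for a in operand_byte_addresses(0xFFFFFFFE, 4)])
```

Masking after the addition is what discards the carry; no separate overflow check is needed in the model.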

Load and store operations have three categories of effective address generation: 

• Register indirect with immediate index mode 

• Register indirect with index mode 

• Register indirect mode 

Refer to Section 2.3.4.3.2, “Integer Load and Store Address Generation,” for further 
discussion of effective address generation for load and store operations. 

Branch instructions have three categories of effective address generation: 

• Immediate 

• Link register indirect 

• Count register indirect 

Refer to Section 2.3.4.4.1, “Branch Instruction Address Calculation,” for further discussion 
of branch instruction effective address generation. 

2.3.2.4 Synchronization 

The synchronization described in this section refers to the state of the processor that is 
performing the synchronization. 

2.3.2.4.1 Context Synchronization 

The System Call (sc) and Return from Interrupt (rfi) instructions perform context 
synchronization by allowing previously issued instructions to complete before performing 
a change in context. Execution of one of these instructions ensures the following: 

• No higher priority exception exists (sc). 

• All previous instructions have completed to a point where they can no longer cause 
an exception. If a prior memory access instruction causes direct-store error 
exceptions, the results are guaranteed to be determined before this instruction is 
executed. 

• Previous instructions complete execution in the context (privilege, protection, and 
address translation) under which they were issued. 

• The instructions following the sc or rfi instruction execute in the context established 
by these instructions. 

2.3.2.4.2 Execution Synchronization 

An instruction is execution synchronizing if all previously initiated instructions appear to 
have completed before the instruction is initiated or, in the case of the Synchronize (sync) 
and Instruction Synchronize (isync) instructions, before the instruction completes. For 
example, the Move to Machine State Register (mtmsr) instruction is execution 
synchronizing. It ensures that all preceding instructions have completed execution and will 
not cause an exception before the instruction executes, but does not ensure subsequent 
instructions execute in the newly established environment. For example, if the mtmsr sets 
the MSR[PR] bit, unless an isync immediately follows the mtmsr instruction, a privileged 
instruction could be executed or privileged access could be performed without causing an 
exception even though the MSR[PR] bit indicates user mode. 

2.3.2.4.3 Instruction-Related Exceptions 

There are two kinds of exceptions in the 603e — those caused directly by the execution of 
an instruction and those caused by an asynchronous event. Either may cause components 
of the system software to be invoked. 

Exceptions can be caused directly by the execution of an instruction as follows: 

• An attempt to execute an illegal instruction causes the illegal instruction (program 
exception) handler to be invoked. An attempt by a user-level program to execute the 
supervisor-level instructions listed below causes the privileged instruction (program 
exception) handler to be invoked. The 603e provides the following supervisor-level 
instructions: dcbi, mfmsr, mfspr, mfsr, mfsrin, mtmsr, mtspr, mtsr, mtsrin, rfi, 
tlbie, tlbsync, tlbld, and tlbli. Note that the privilege level of the mfspr and mtspr 
instructions depends on the SPR encoding. 

• An attempt to access memory that is not available (page fault) causes the ISI 
exception handler to be invoked. 

• An attempt to access memory with an effective address alignment that is invalid for 
the instruction causes the alignment exception handler to be invoked. 

• The execution of an sc instruction invokes the system call exception handler that 
permits a program to request the system to perform a service. 

• The execution of a trap instruction invokes the program exception trap handler. 

• The execution of a floating-point instruction when floating-point instructions are 
disabled invokes the floating-point unavailable handler. 

• The execution of an instruction that causes a floating-point exception while 
exceptions are enabled in the MSR invokes the program exception handler. 

Exceptions caused by asynchronous events are described in Chapter 4, “Exceptions.” 

2.3.3 Instruction Set Overview 

This section provides a brief overview of the PowerPC instructions implemented in the 
603e and highlights any special information with respect to how the 603e implements a 
particular instruction. Note that the categories used in this section correspond to those used 
in Chapter 4, “Addressing Modes and Instruction Set Summary,” in The Programming 
Environments Manual. These categorizations are somewhat arbitrary and are provided for 
the convenience of the programmer and do not necessarily reflect the PowerPC architecture 
specification. 

Note that some of the instructions have the following optional features: 

• CR Update: The dot (.) suffix on the mnemonic enables the update of the CR. 

• Overflow option: The o suffix indicates that the overflow bit in the XER is enabled. 
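
The effect of the two options can be sketched for a 32-bit add with both enabled (the addo. form). This Python model is illustrative only: the function names are hypothetical, and it assumes the CR0 bits (LT, GT, EQ, SO) and the XER[OV] and sticky XER[SO] bits behave as defined in The Programming Environments Manual.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def add_with_options(a: int, b: int, xer_so: int = 0):
    """Model addo.: return (result, OV, SO, CR0) for 32-bit operands."""
    exact = to_signed32(a) + to_signed32(b)
    result = exact & MASK32
    # OV is set when the exact sum does not fit in a signed 32-bit word.
    ov = int(not (-(1 << 31) <= exact <= (1 << 31) - 1))
    so = xer_so | ov  # SO is sticky: once set, it stays set in this model
    signed = to_signed32(result)
    cr0 = {
        "LT": int(signed < 0),
        "GT": int(signed > 0),
        "EQ": int(signed == 0),
        "SO": so,
    }
    return result, ov, so, cr0


print(add_with_options(0x7FFFFFFF, 1))
```

Without the dot suffix the CR0 dictionary would simply not be written, and without the o suffix OV and SO would be left unchanged.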

2.3.4 PowerPC UISA Instructions 

The PowerPC UISA includes the base user-level instruction set (excluding a few user-level 
cache control, synchronization, and time base instructions), user-level registers, 
programming model, data types, and addressing modes. This section discusses the 
instructions defined in the UISA. 

2.3.4.1 Integer Instructions 

This section describes the integer instructions. These consist of the following: 

• Integer arithmetic instructions 

• Integer compare instructions 

• Integer logical instructions 

• Integer rotate and shift instructions 

Integer instructions use the content of the GPRs as source operands and place results into 
GPRs, into the XER, and into condition register (CR) fields. 

2.3.4.1.1 Integer Arithmetic Instructions 

Table 2-9 lists the integer arithmetic instructions for the 603e. 

Table 2-9. Integer Arithmetic Instructions 

Name                                Mnemonic                            Operand Syntax 
Add Immediate                       addi                                rD,rA,SIMM 
Add Immediate Shifted               addis                               rD,rA,SIMM 
Add                                 add (add. addo addo.)               rD,rA,rB 
Subtract From                       subf (subf. subfo subfo.)           rD,rA,rB 
Add Immediate Carrying              addic                               rD,rA,SIMM 
Add Immediate Carrying and Record   addic.                              rD,rA,SIMM 
Subtract from Immediate Carrying    subfic                              rD,rA,SIMM 
Add Carrying                        addc (addc. addco addco.)           rD,rA,rB 
Subtract from Carrying              subfc (subfc. subfco subfco.)       rD,rA,rB 
Add Extended                        adde (adde. addeo addeo.)           rD,rA,rB 
Subtract from Extended              subfe (subfe. subfeo subfeo.)       rD,rA,rB 
Add to Minus One Extended           addme (addme. addmeo addmeo.)       rD,rA 
Subtract from Minus One Extended    subfme (subfme. subfmeo subfmeo.)   rD,rA 
Add to Zero Extended                addze (addze. addzeo addzeo.)       rD,rA 
Subtract from Zero Extended         subfze (subfze. subfzeo subfzeo.)   rD,rA 
Negate                              neg (neg. nego nego.)               rD,rA 
Multiply Low Immediate              mulli                               rD,rA,SIMM 
Multiply Low                        mullw (mullw. mullwo mullwo.)       rD,rA,rB 
Multiply High Word                  mulhw (mulhw.)                      rD,rA,rB 
Multiply High Word Unsigned         mulhwu (mulhwu.)                    rD,rA,rB 
Divide Word                         divw (divw. divwo divwo.)           rD,rA,rB 
Divide Word Unsigned                divwu (divwu. divwuo divwuo.)       rD,rA,rB 

Although there is no Subtract Immediate instruction, its effect can be achieved by using an 
addi instruction with the immediate operand negated. Simplified mnemonics are provided 
that include this negation. The subf instructions subtract the second operand (rA) from the 
third operand (rB). Simplified mnemonics are provided in which the third operand is 
subtracted from the second operand. See Appendix F, “Simplified Mnemonics,” in The 
Programming Environments Manual. 
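
The operand order of subf and the two simplified mnemonics mentioned above can be sketched in Python. This is an illustrative model that works on register values rather than register numbers; the function names mirror the mnemonics, and the equivalences subi rD,rA,value = addi rD,rA,-value and sub rD,rA,rB = subf rD,rB,rA are taken from the simplified mnemonic definitions as an assumption of this sketch.

```python
MASK32 = 0xFFFFFFFF


def addi(ra_value: int, simm: int) -> int:
    """addi rD,rA,SIMM: rD = rA + SIMM (modulo 2**32)."""
    return (ra_value + simm) & MASK32


def subf(ra_value: int, rb_value: int) -> int:
    """subf rD,rA,rB: rD = rB - rA (second operand subtracted from third)."""
    return (rb_value - ra_value) & MASK32


def subi(ra_value: int, value: int) -> int:
    """Simplified mnemonic: addi with the immediate operand negated."""
    return addi(ra_value, -value)


def sub(ra_value: int, rb_value: int) -> int:
    """Simplified mnemonic: subf with the source operands swapped (rA - rB)."""
    return subf(rb_value, ra_value)


print(subf(3, 10), sub(10, 3), subi(10, 3))
```

The model ignores the 16-bit range limit on SIMM and the special treatment of rA = 0 in addi, both of which a real assembler or decoder would have to handle.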

2.3.4.1.2 Integer Compare Instructions 

The integer compare instructions algebraically or logically compare the contents of rA with 
either the UIMM operand, the SIMM operand, or the contents of rB. The comparison is 
signed for the cmpi and cmp instructions, and unsigned for the cmpli and cmpl 
instructions. Table 2-10 lists the integer compare instructions. 



Table 2-10. Integer Compare Instructions 

Name                        Mnemonic   Operand Syntax 
Compare Immediate           cmpi       crfD,L,rA,SIMM 
Compare                     cmp        crfD,L,rA,rB 
Compare Logical Immediate   cmpli      crfD,L,rA,UIMM 
Compare Logical             cmpl       crfD,L,rA,rB 

The crfD operand can be omitted if the result of the comparison is to be placed in CR0. 
Otherwise the target CR field must be specified in the instruction crfD field. 

For more information refer to Appendix F, “Simplified Mnemonics,” in The Programming 
Environments Manual. 
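
The difference between the signed (cmp, cmpi) and unsigned (cmpl, cmpli) comparisons can be sketched in Python. This is an illustrative model only; the function names are hypothetical, and the result is shown as the LT, GT, and EQ bits of the target CR field (the SO bit, which is copied from the XER, is omitted).

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def compare(ra_value: int, rb_value: int, signed: bool) -> dict:
    """Return the LT, GT, and EQ bits for a 32-bit comparison."""
    a, b = ra_value & MASK32, rb_value & MASK32
    if signed:
        a, b = to_signed32(a), to_signed32(b)
    return {"LT": int(a < b), "GT": int(a > b), "EQ": int(a == b)}


# The same bit pattern compares differently under the two interpretations.
print(compare(0xFFFFFFFF, 1, signed=True))
print(compare(0xFFFFFFFF, 1, signed=False))
```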

2.3.4.1.3 Integer Logical Instructions 

The logical instructions shown in Table 2-11 perform bit-parallel operations. Logical 
instructions with the CR update enabled and instructions andi. and andis. set CR field CR0 
to characterize the result of the logical operation. These fields are set as if the sign-extended 
low-order 32 bits of the result were algebraically compared to zero. Logical instructions 
without CR update and the remaining logical instructions do not modify the CR. Logical 
instructions do not affect the XER[SO], XER[OV], and XER[CA] bits. 

For simplified mnemonics examples for the integer logical operations see Appendix F, 
“Simplified Mnemonics,” in The Programming Environments Manual. 

Table 2-11. Integer Logical Instructions 

Name                        Mnemonic           Operand Syntax 
AND Immediate               andi.              rA,rS,UIMM 
AND Immediate Shifted       andis.             rA,rS,UIMM 
OR Immediate                ori                rA,rS,UIMM 
OR Immediate Shifted        oris               rA,rS,UIMM 
XOR Immediate               xori               rA,rS,UIMM 
XOR Immediate Shifted       xoris              rA,rS,UIMM 
AND                         and (and.)         rA,rS,rB 
OR                          or (or.)           rA,rS,rB 
XOR                         xor (xor.)         rA,rS,rB 
NAND                        nand (nand.)       rA,rS,rB 
NOR                         nor (nor.)         rA,rS,rB 
Equivalent                  eqv (eqv.)         rA,rS,rB 
AND with Complement         andc (andc.)       rA,rS,rB 
OR with Complement          orc (orc.)         rA,rS,rB 
Extend Sign Byte            extsb (extsb.)     rA,rS 
Extend Sign Half Word       extsh (extsh.)     rA,rS 
Count Leading Zeros Word    cntlzw (cntlzw.)   rA,rS 



2.3.4.1.4 Integer Rotate and Shift Instructions 

Rotation operations are performed on data from a GPR, and the result, or a portion of the 
result, is returned to a GPR. See Appendix F, “Simplified Mnemonics,” in The 
Programming Environments Manual for a complete list of simplified mnemonics that 
allows simpler coding of often-used functions such as clearing the leftmost or rightmost 
bits of a register, left justifying or right justifying an arbitrary field, and simple rotates and 
shifts. 

Integer rotate instructions rotate the contents of a register. The result of the rotation is either 
inserted into the target register under control of a mask (if a mask bit is 1 the associated bit 
of the rotated data is placed into the target register, and if the mask bit is 0 the associated 
bit in the target register is unchanged), or ANDed with a mask before being placed into the 
target register. 
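
The two ways of using the rotated data (AND with a mask, or insert under a mask) can be sketched in Python. This is an illustrative model only; the helper names rotl32 and mask32 are hypothetical, and the mask is assumed to follow the PowerPC convention in which bit 0 is the most-significant bit and the mask contains ones from bit MB through bit ME, wrapping around when MB is greater than ME.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value: int, shift: int) -> int:
    """Rotate a 32-bit value left by shift (0-31) bit positions."""
    value &= MASK32
    shift &= 31
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def mask32(mb: int, me: int) -> int:
    """Ones in bits mb through me, numbering bit 0 as most significant."""
    from_mb = MASK32 >> mb                        # ones in bits mb..31
    through_me = (MASK32 << (31 - me)) & MASK32   # ones in bits 0..me
    return (from_mb & through_me) if mb <= me else (from_mb | through_me)


def rlwinm(rs: int, sh: int, mb: int, me: int) -> int:
    """Rotate left by an immediate, then AND with the mask."""
    return rotl32(rs, sh) & mask32(mb, me)


def rlwnm(rs: int, rb: int, mb: int, me: int) -> int:
    """Rotate left by the low-order five bits of rB, then AND with the mask."""
    return rlwinm(rs, rb & 31, mb, me)


def rlwimi(ra: int, rs: int, sh: int, mb: int, me: int) -> int:
    """Rotate left, then insert under the mask; other bits of rA are kept."""
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK32)


print(hex(rlwinm(0x12345678, 8, 24, 31)))                # extract the top byte
print(hex(rlwimi(0xAAAAAAAA, 0x000000FF, 8, 16, 23)))    # insert a byte field
```

In rlwimi the bits of rA outside the mask pass through unchanged, which is what distinguishes it from the AND-with-mask forms.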

The integer rotate instructions are listed in Table 2-12. 

Table 2-12. Integer Rotate Instructions 

Name                                            Mnemonic           Operand Syntax 
Rotate Left Word Immediate then AND with Mask   rlwinm (rlwinm.)   rA,rS,SH,MB,ME 
Rotate Left Word then AND with Mask             rlwnm (rlwnm.)     rA,rS,rB,MB,ME 
Rotate Left Word Immediate then Mask Insert     rlwimi (rlwimi.)   rA,rS,SH,MB,ME 

The integer shift instructions perform left and right shifts. Immediate-form logical 
(unsigned) shift operations are obtained by specifying masks and shift values for certain 
rotate instructions. Simplified mnemonics are provided to make coding of such shifts 
simpler and easier to understand. 

Multiple-precision shifts can be programmed as shown in Appendix C, “Multiple-Precision 
Shifts,” in The Programming Environments Manual. 
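
The way an immediate-form logical shift is obtained from a rotate instruction can be sketched in Python. This is an illustrative model only; it assumes the simplified mnemonic definitions slwi rA,rS,n = rlwinm rA,rS,n,0,31-n and srwi rA,rS,n = rlwinm rA,rS,32-n,n,31 for n less than 32, and the compact rlwinm helper here handles only masks with MB less than or equal to ME.

```python
MASK32 = 0xFFFFFFFF


def rlwinm(rs: int, sh: int, mb: int, me: int) -> int:
    """Rotate left by sh, then AND with ones in bits mb..me (mb <= me only)."""
    rs &= MASK32
    rotated = ((rs << sh) | (rs >> (32 - sh))) & MASK32
    mask = (MASK32 >> mb) & (MASK32 << (31 - me)) & MASK32
    return rotated & mask


def slwi(rs: int, n: int) -> int:
    """Shift left word immediate, expressed as a rotate and mask."""
    return rlwinm(rs, n, 0, 31 - n)


def srwi(rs: int, n: int) -> int:
    """Shift right word immediate, expressed as a rotate and mask."""
    return rlwinm(rs, (32 - n) % 32, n, 31)


print(hex(slwi(0x80000001, 4)), hex(srwi(0x80000001, 4)))
```

The mask clears exactly the bit positions that the rotation wrapped around, which is why the result matches a plain logical shift.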

The integer shift instructions are listed in Table 2-13. 

Table 2-13. Integer Shift Instructions 

Name                                   Mnemonic         Operand Syntax 
Shift Left Word                        slw (slw.)       rA,rS,rB 
Shift Right Word                       srw (srw.)       rA,rS,rB 
Shift Right Algebraic Word Immediate   srawi (srawi.)   rA,rS,SH 
Shift Right Algebraic Word             sraw (sraw.)     rA,rS,rB 



2.3.4.2 Floating-Point Instructions 

This section describes the floating-point instructions, which include the following: 

• Floating-point arithmetic instructions 

• Floating-point multiply-add instructions 

• Floating-point rounding and conversion instructions 

• Floating-point compare instructions 

• Floating-point status and control register instructions 

• Floating-point move instructions 

See Section 2.3.4.3, “Load and Store Instructions,” for information about floating-point 
loads and stores. 

The PowerPC architecture supports a floating-point system as defined in the IEEE 754 
standard, but requires software support to conform with that standard. All floating-point 
operations conform to the IEEE 754 standard, except if software sets the non-IEEE mode 
bit (NI) in the FPSCR; the 603e is in the nondenormalized mode when the NI bit is set in 
the FPSCR. If a denormalized result is produced, a default result of zero is generated. The 
generated zero has the same sign as the denormalized number. The 603e performs single- 
and double-precision floating-point operations compliant with the IEEE-754 floating-point 
standard. 
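
The nondenormalized-mode behavior described above (a denormalized result is replaced by a zero of the same sign) can be sketched in Python for double-precision values. This is an illustrative model only; the function name is hypothetical, and the sketch covers results only, not the treatment of denormalized input operands or of single-precision values.

```python
import math
import sys


def flush_denormal_result(value: float) -> float:
    """Replace a denormalized double-precision result with a same-signed zero."""
    # sys.float_info.min is the smallest positive normalized double.
    if value != 0.0 and abs(value) < sys.float_info.min:
        return math.copysign(0.0, value)
    return value


tiny = -5e-324  # the smallest-magnitude negative denormalized double
flushed = flush_denormal_result(tiny)
print(flushed, math.copysign(1.0, flushed))
```

Normalized numbers, zeros, infinities, and NaNs pass through the function unchanged.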

Implementation Note — Single-precision denormalized results require two additional 
processor clock cycles to round. When loading or storing a single-precision denormalized 
number, the load/store unit may take up to 24 processor clock cycles to convert between the 
internal double-precision format and the external single-precision format. 

2.3.4.2.1 Floating-Point Arithmetic Instructions 

The floating-point arithmetic instructions are listed in Table 2-14. 



Table 2-14. Floating-Point Arithmetic Instructions 

Name                                       Mnemonic            Operand Syntax 
Floating Add (Double-Precision)            fadd (fadd.)        frD,frA,frB 
Floating Add Single                        fadds (fadds.)      frD,frA,frB 
Floating Subtract (Double-Precision)       fsub (fsub.)        frD,frA,frB 
Floating Subtract Single                   fsubs (fsubs.)      frD,frA,frB 
Floating Multiply (Double-Precision)       fmul (fmul.)        frD,frA,frC 
Floating Multiply Single                   fmuls (fmuls.)      frD,frA,frC 
Floating Divide (Double-Precision)         fdiv (fdiv.)        frD,frA,frB 
Floating Divide Single                     fdivs (fdivs.)      frD,frA,frB 
Floating Reciprocal Estimate Single        fres (fres.)        frD,frB 
Floating Reciprocal Square Root Estimate   frsqrte (frsqrte.)  frD,frB 
Floating Select                            fsel (fsel.)        frD,frA,frC,frB 



2.3.4.2.2 Floating-Point Multiply-Add Instructions 

These instructions combine multiply and add operations without an intermediate rounding 
operation. The fractional part of the intermediate product is 106 bits wide, and all 106 bits 
take part in the add/subtract portion of the instruction. 
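
The effect of omitting the intermediate rounding can be demonstrated with a Python sketch that compares a single-rounding multiply-add against a separate multiply followed by an add. This is an illustrative model only: the function names are hypothetical, exact rational arithmetic stands in for the full-width intermediate product, the operand order (frA times frC, plus frB) follows the fmadd operand syntax, and only finite double-precision inputs are handled.

```python
from fractions import Fraction


def fmadd(a: float, c: float, b: float) -> float:
    """(a * c) + b computed exactly, then rounded once to double precision."""
    return float(Fraction(a) * Fraction(c) + Fraction(b))


def multiply_then_add(a: float, c: float, b: float) -> float:
    """(a * c) rounded to double precision, then added to b and rounded again."""
    return a * c + b


a = 1.0 + 2.0 ** -30
c = 1.0 - 2.0 ** -30
b = -1.0
# The exact product is 1 - 2**-60, which is not representable as a double.
print(fmadd(a, c, b), multiply_then_add(a, c, b))
```

In the separate multiply and add, the product is rounded to 1.0 and the small term is lost before the addition takes place; the single-rounding form preserves it.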

The floating-point multiply-add instructions are listed in Table 2-15. 



Table 2-15. Floating-Point Multiply-Add Instructions 

Name                                                     Mnemonic             Operand Syntax 
Floating Multiply-Add (Double-Precision)                 fmadd (fmadd.)       frD,frA,frC,frB 
Floating Multiply-Add Single                             fmadds (fmadds.)     frD,frA,frC,frB 
Floating Multiply-Subtract (Double-Precision)            fmsub (fmsub.)       frD,frA,frC,frB 
Floating Multiply-Subtract Single                        fmsubs (fmsubs.)     frD,frA,frC,frB 
Floating Negative Multiply-Add (Double-Precision)        fnmadd (fnmadd.)     frD,frA,frC,frB 
Floating Negative Multiply-Add Single                    fnmadds (fnmadds.)   frD,frA,frC,frB 
Floating Negative Multiply-Subtract (Double-Precision)   fnmsub (fnmsub.)     frD,frA,frC,frB 
Floating Negative Multiply-Subtract Single               fnmsubs (fnmsubs.)   frD,frA,frC,frB 

Implementation Note: Single-precision multiply-type instructions operate faster than 
their double-precision equivalents. See Chapter 6, “Instruction Timing,” for more 
information. 

2.3.4.2.3 Floating-Point Rounding and Conversion Instructions 

The Floating Round to Single-Precision (frsp) instruction is used to round a 64-bit 
double-precision number to a 32-bit single-precision floating-point number. The 
floating-point conversion instructions convert a 64-bit double-precision floating-point 
number to a 32-bit signed integer number. 

The PowerPC architecture defines bits 0-31 of floating-point register frD as undefined 
when executing the Floating Convert to Integer Word (fctiw) and Floating Convert to 
Integer Word with Round toward Zero (fctiwz) instructions. 

Examples of uses of these instructions to perform various conversions can be found in 
Appendix D, “Floating-Point Models,” in The Programming Environments Manual. The 
floating-point rounding and conversion instructions are shown in Table 2-16. 
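
The results produced by frsp and fctiwz can be approximated with a short Python sketch. This is an illustrative model under stated assumptions, not a definition of the instructions: the frsp model supports only round-to-nearest (the instruction itself honors the rounding mode selected in the FPSCR) and raises OverflowError for values beyond the single-precision range, and the saturation values used by the fctiwz model for out-of-range inputs and NaNs are assumptions based on the architecture definition in The Programming Environments Manual.

```python
import math
import struct


def frsp(value: float) -> float:
    """Round a double to single precision (round-to-nearest model only)."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


def fctiwz(value: float) -> int:
    """Convert to a signed 32-bit integer, rounding toward zero.

    Out-of-range inputs saturate and NaNs give the most negative integer;
    both choices are assumptions of this sketch.
    """
    if math.isnan(value):
        return -(2 ** 31)
    if value >= 2 ** 31:
        return 2 ** 31 - 1
    if value <= -(2 ** 31):
        return -(2 ** 31)
    return math.trunc(value)


print(frsp(0.1), fctiwz(-2.9), fctiwz(1e12))
```

The model returns the integer as a Python int; in the processor the result occupies the low-order 32 bits of frD, and the high-order 32 bits are undefined as noted above.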



Table 2-16. Floating-Point Rounding and Conversion Instructions 

Name                                                      Mnemonic           Operand Syntax 
Floating Round to Single-Precision                        frsp (frsp.)       frD,frB 
Floating Convert to Integer Word                          fctiw (fctiw.)     frD,frB 
Floating Convert to Integer Word with Round toward Zero   fctiwz (fctiwz.)   frD,frB 



2.3.4.2.4 Floating-Point Compare Instructions 

Floating-point compare instructions compare the contents of two floating-point registers. 
The comparison ignores the sign of zero (that is, +0 = -0). The floating-point compare 
instructions are listed in Table 2-17. 

Table 2-17. Floating-Point Compare Instructions 

Name                         Mnemonic   Operand Syntax 
Floating Compare Unordered   fcmpu      crfD,frA,frB 
Floating Compare Ordered     fcmpo      crfD,frA,frB 
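
The outcome of a floating-point compare can be sketched in Python. This is an illustrative model only: the function name is hypothetical, the result labels FL, FG, FE, and FU (less than, greater than, equal, unordered) are assumed to match the condition bits defined in The Programming Environments Manual, and the exception behavior that distinguishes fcmpo from fcmpu when an operand is a NaN is not modeled.

```python
import math


def fcmp(fra: float, frb: float) -> str:
    """Return which of the four compare outcomes applies."""
    if math.isnan(fra) or math.isnan(frb):
        return "FU"  # unordered: at least one operand is a NaN
    if fra < frb:
        return "FL"
    if fra > frb:
        return "FG"
    return "FE"


# Positive and negative zero compare as equal.
print(fcmp(0.0, -0.0), fcmp(float("nan"), 1.0))
```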



2.3.4.2.5 Floating-Point Status and Control Register Instructions 

Every FPSCR instruction appears to synchronize the effects of all floating-point 
instructions executed by a given processor. Executing an FPSCR instruction ensures that 
all floating-point instructions previously initiated by the given processor appear to have 
completed before the FPSCR instruction is initiated and that no subsequent floating-point 
instructions appear to be initiated by the given processor until the FPSCR instruction has 
completed. The FPSCR instructions are listed in Table 2-18.

Table 2-18. Floating-Point Status and Control Register Instructions

Name                                   Mnemonic          Operand Syntax
Move from FPSCR                        mffs (mffs.)      frD
Move to Condition Register from FPSCR  mcrfs             crfD,crfS
Move to FPSCR Field Immediate          mtfsfi (mtfsfi.)  crfD,IMM
Move to FPSCR Fields                   mtfsf (mtfsf.)    FM,frB
Move to FPSCR Bit 0                    mtfsb0 (mtfsb0.)  crbD
Move to FPSCR Bit 1                    mtfsb1 (mtfsb1.)  crbD

Implementation Note — The architecture notes that, in some implementations, the Move
to FPSCR Fields (mtfsf) instruction may perform more slowly when only a portion of the
fields are updated as opposed to all of the fields. This is not the case in the 603e.
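The notion of updating only a portion of the fields can be sketched as follows. This Python model is an illustration under stated assumptions: it treats the FM operand as an 8-bit mask in which the most significant bit selects FPSCR field 0 (bits 0-3), and it ignores the special handling of the FEX and VX summary bits, which are not written directly by the real instruction. The helper names are hypothetical.

```python
def fm_to_mask(fm: int) -> int:
    """Expand the 8-bit FM field into a 32-bit mask, one nibble per selected field."""
    mask = 0
    for field in range(8):
        if fm & (0x80 >> field):           # FM bit 'field' selects FPSCR field 'field'
            mask |= 0xF << (28 - 4 * field)
    return mask

def mtfsf(fpscr: int, fm: int, frb_low32: int) -> int:
    """Return the new FPSCR value (simplified: FEX and VX handling is not modeled)."""
    mask = fm_to_mask(fm)
    return ((fpscr & ~mask) | (frb_low32 & mask)) & 0xFFFFFFFF
```

For example, an FM value of 0x01 selects only field 7, so only the four low-order FPSCR bits (which include the rounding mode) are replaced.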

2.3.4.2.6 Floating-Point Move Instructions 

Floating-point move instructions copy data from one floating-point register to another. The
floating-point move instructions do not modify the FPSCR. The CR update option in these
instructions controls the placing of result status into CR1. Floating-point move instructions
are listed in Table 2-19.
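Since the move instructions only copy the operand and optionally alter its sign, they can be modeled as pure bit operations on the 64-bit register image. The following sketch is illustrative; the helper names to_bits and from_bits are assumptions introduced here, and the optional CR1 update is not modeled.

```python
import struct

SIGN_BIT = 1 << 63
MASK64 = (1 << 64) - 1

def to_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 image of a Python float."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def from_bits(bits: int) -> float:
    """Return the Python float for a 64-bit IEEE 754 image."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

def fmr(frb: int) -> int:
    return frb                            # plain copy

def fneg(frb: int) -> int:
    return frb ^ SIGN_BIT                 # invert the sign bit

def fabs(frb: int) -> int:
    return frb & (MASK64 ^ SIGN_BIT)      # clear the sign bit

def fnabs(frb: int) -> int:
    return frb | SIGN_BIT                 # set the sign bit
```

No rounding or exception checking is involved, which is consistent with the statement that these instructions do not modify the FPSCR.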



Table 2-19. Floating-Point Move Instructions

Name                              Mnemonic        Operand Syntax
Floating Move Register            fmr (fmr.)      frD,frB
Floating Negate                   fneg (fneg.)    frD,frB
Floating Absolute Value           fabs (fabs.)    frD,frB
Floating Negative Absolute Value  fnabs (fnabs.)  frD,frB



2.3.4.3 Load and Store Instructions

Load and store instructions are issued and translated in program order; however, the 
accesses can occur out of order. Synchronizing instructions are provided to enforce strict 
ordering. This section describes the load and store instructions of the 603e, which consist 
of the following: 

• Integer load instructions 

• Integer store instructions 

• Integer load and store with byte-reverse instructions 

• Integer load and store multiple instructions 

• Integer load and store string instructions 

• Floating-point load instructions 

• Floating-point store instructions

2.3.4.3.1 Self-Modifying Code

When a processor modifies a memory location that may be contained in the instruction 
cache, software must ensure that memory updates are visible to the instruction fetching 
mechanism. This can be achieved by the following instruction sequence: 

    dcbst    |update memory
    sync     |wait for update
    icbi     |remove (invalidate) copy in instruction cache
    isync    |remove copy in own instruction buffer

These operations are required because the data cache is a write-back cache. Since 
instruction fetching bypasses the data cache, changes to items in the data cache may not be 
reflected in memory until the fetch operations complete. 
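The need for this sequence can be illustrated with a toy model of split caches. The Python sketch below is a deliberate simplification and not a description of the 603e hardware: addresses stand for whole cache blocks, the class and method names are hypothetical, and sync and isync are omitted because the model has no pipeline or bus timing.

```python
class ToySplitCache:
    """Toy model: write-back data cache plus a separate instruction cache."""

    def __init__(self):
        self.memory = {}
        self.dcache = {}      # modified data that has not reached memory yet
        self.icache = {}      # instruction copies fetched from memory

    def store(self, addr, value):
        self.dcache[addr] = value             # write-back: memory is not updated

    def dcbst(self, addr):
        if addr in self.dcache:
            self.memory[addr] = self.dcache[addr]   # push modified data to memory

    def icbi(self, addr):
        self.icache.pop(addr, None)           # invalidate the stale instruction copy

    def fetch(self, addr):
        if addr not in self.icache:           # instruction fetch bypasses the data cache
            self.icache[addr] = self.memory.get(addr, 0)
        return self.icache[addr]

def demo():
    m = ToySplitCache()
    m.memory[0x100] = 1
    seen = [m.fetch(0x100)]     # original instruction
    m.store(0x100, 2)
    seen.append(m.fetch(0x100)) # still the old instruction
    m.dcbst(0x100)
    seen.append(m.fetch(0x100)) # memory updated, instruction cache still stale
    m.icbi(0x100)
    seen.append(m.fetch(0x100)) # new instruction becomes visible
    return seen
```

In this model the new value only becomes visible to instruction fetching after both the dcbst step and the icbi step have been performed.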

Special care must be taken to avoid coherency paradoxes in systems that implement unified 
secondary caches, and designers should carefully follow the guidelines for maintaining 
cache coherency that are provided in the VEA, and discussed in Chapter 5, “Cache Model 
and Memory Coherency,” in The Programming Environments Manual. Because the 603e 
does not broadcast the M bit for instruction fetches, external caches are subject to 
coherency paradoxes. 

2.3.4.3.2 Integer Load and Store Address Generation 

Integer load and store operations generate effective addresses using register indirect with 
immediate index mode, register indirect with index mode, or register indirect mode. See 
Section 2.3.2.3, “Effective Address Calculation,” for information about calculating 
effective addresses. Note that the 603e is optimized for load and store operations that are 
aligned on natural boundaries, and operations that are not naturally aligned may suffer 
performance degradation. Refer to Section 4.5.6.1, “Integer Alignment Exceptions,” for
additional information about load and store address alignment exceptions.
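The addressing modes named above can be sketched as simple address arithmetic. The following Python model is an illustration, assuming the architectural convention that an rA field of 0 denotes the value zero rather than the contents of r0, and that the displacement d is a sign-extended 16-bit value. The function names are hypothetical.

```python
def sign_extend16(d: int) -> int:
    """Interpret the low 16 bits of d as a signed displacement."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def ea_immediate_index(gpr, ra: int, d: int) -> int:
    """Register indirect with immediate index: EA = (rA|0) + EXTS(d)."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + sign_extend16(d)) & 0xFFFFFFFF

def ea_index(gpr, ra: int, rb: int) -> int:
    """Register indirect with index: EA = (rA|0) + (rB)."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & 0xFFFFFFFF

def is_naturally_aligned(ea: int, size: int) -> bool:
    """True when an operand of 'size' bytes sits on its natural boundary."""
    return ea % size == 0
```

Register indirect mode corresponds to the same calculation with neither a displacement nor an index added, that is, EA = (rA|0).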

2.3.4.3.3 Register Indirect Integer Load Instructions 

For integer load instructions, the byte, half word, word, or double word addressed by the 
EA is loaded into rD. Many integer load instructions have an update form, in which rA is 
updated with the generated effective address. For these forms, the EA is placed into rA and 
the memory element (byte, half word, word, or double word) addressed by EA is loaded 
into rD. 
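The difference between the zero and algebraic load forms, and the effect of the update form, can be sketched as follows. This Python model is illustrative only; memory is a big-endian byte string, the register file is a list, and the check for rA = 0 or rA = rD reflects the architecture's definition of invalid forms for integer load with update instructions rather than any 603e-specific behavior.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low 'bits' bits of value as a signed number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def lhz(mem: bytes, ea: int) -> int:
    """Load half word and zero: the upper 16 bits of the result are cleared."""
    return int.from_bytes(mem[ea:ea + 2], "big")

def lha(mem: bytes, ea: int) -> int:
    """Load half word algebraic: the half word is sign-extended to 32 bits."""
    return sign_extend(lhz(mem, ea), 16) & 0xFFFFFFFF

def lhzu(gpr: list, mem: bytes, rd: int, ra: int, d: int) -> None:
    """Load half word and zero with update: rA receives the EA."""
    if ra == 0 or ra == rd:
        raise ValueError("invalid form in the architecture")
    ea = (gpr[ra] + sign_extend(d, 16)) & 0xFFFFFFFF
    gpr[rd] = lhz(mem, ea)
    gpr[ra] = ea
```

With the half word 0xFFFE in memory, lhz returns 0x0000FFFE while lha returns 0xFFFFFFFE.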

Implementation Note — In some implementations of the PowerPC architecture, the load
half word algebraic instructions (lha and lhax) and the load with update (lbzu, lbzux, lhzu,
lhzux, lhau, lhaux, lwzu, and lwzux) instructions may execute with greater latency than
other types of load instructions. In the 603e, these instructions operate with the same
latency as other load instructions.

Table 2-20 lists the integer load instructions.



Table 2-20. Integer Load Instructions

Name                                          Mnemonic  Operand Syntax
Load Byte and Zero                            lbz       rD,d(rA)
Load Byte and Zero Indexed                    lbzx      rD,rA,rB
Load Byte and Zero with Update                lbzu      rD,d(rA)
Load Byte and Zero with Update Indexed        lbzux     rD,rA,rB
Load Half Word and Zero                       lhz       rD,d(rA)
Load Half Word and Zero Indexed               lhzx      rD,rA,rB
Load Half Word and Zero with Update           lhzu      rD,d(rA)
Load Half Word and Zero with Update Indexed   lhzux     rD,rA,rB
Load Half Word Algebraic                      lha       rD,d(rA)
Load Half Word Algebraic Indexed              lhax      rD,rA,rB
Load Half Word Algebraic with Update          lhau      rD,d(rA)
Load Half Word Algebraic with Update Indexed  lhaux     rD,rA,rB
Load Word and Zero                            lwz       rD,d(rA)
Load Word and Zero Indexed                    lwzx      rD,rA,rB
Load Word and Zero with Update                lwzu      rD,d(rA)
Load Word and Zero with Update Indexed        lwzux     rD,rA,rB



2.3.4.3.4 Integer Store Instructions

For integer store instructions, the contents of rS are stored into the byte, half word, word, 
or double word in memory addressed by the effective address (EA). Many store instructions 
have an update form, in which rA is updated with the EA. For these forms, the following 
rules apply: 

• If rA ≠ 0, the EA is placed into rA.

• If rS = rA, the contents of rS are copied to the target memory element, then the
generated EA is placed into rA (rS).
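The ordering implied by these rules (store first, then update rA) can be sketched with a small model of the stwu instruction. The Python code below is illustrative only; the register file is a list, memory is a big-endian bytearray, and the function name mirrors the mnemonic.

```python
def sign_extend16(d: int) -> int:
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def stwu(gpr: list, mem: bytearray, rs: int, ra: int, d: int) -> None:
    """Store word with update: memory is written before rA is updated."""
    if ra == 0:
        raise ValueError("store with update with rA = 0 is an invalid form")
    ea = (gpr[ra] + sign_extend16(d)) & 0xFFFFFFFF
    mem[ea:ea + 4] = gpr[rs].to_bytes(4, "big")   # rS is stored first...
    gpr[ra] = ea                                  # ...then rA receives the EA
```

When rS and rA name the same register, as in the common idiom of storing the old stack pointer while allocating a frame, the value stored is the register's original contents and the register then receives the new address.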

The 603e defines store with update instructions with rA = 0 and integer store instructions 
with the CR update option enabled (Rc field, bit 31, in the instruction encoding = 1) to be 
invalid forms. Table 2-21 provides a list of the integer store instructions for the 603e.

Table 2-21. Integer Store Instructions

Name                                 Mnemonic  Operand Syntax
Store Byte                           stb       rS,d(rA)
Store Byte Indexed                   stbx      rS,rA,rB
Store Byte with Update               stbu      rS,d(rA)
Store Byte with Update Indexed       stbux     rS,rA,rB
Store Half Word                      sth       rS,d(rA)
Store Half Word Indexed              sthx      rS,rA,rB
Store Half Word with Update          sthu      rS,d(rA)
Store Half Word with Update Indexed  sthux     rS,rA,rB
Store Word                           stw       rS,d(rA)
Store Word Indexed                   stwx      rS,rA,rB
Store Word with Update               stwu      rS,d(rA)
Store Word with Update Indexed       stwux     rS,rA,rB



2.3.4.3.5 Integer Load and Store with Byte-Reverse Instructions

Table 2-22 describes integer load and store with byte-reverse instructions. When used in a 
PowerPC system operating with the default big-endian byte order, these instructions have 
the effect of loading and storing data in little-endian order. Likewise, when used in a 
PowerPC system operating with little-endian byte order, these instructions have the effect 
of loading and storing data in big-endian order. For more information about big-endian and 
little-endian byte ordering, see “Byte Ordering” in Chapter 3, “Operand Conventions,” in 
The Programming Environments Manual.

Implementation Note — In some PowerPC implementations, load byte-reverse
instructions (lhbrx and lwbrx) may have greater latency than other load instructions;
however, these instructions operate with the same latency as other load instructions in the
603e.
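The effect of the byte-reverse forms in a big-endian system can be shown with a short model. This Python sketch is illustrative; it treats memory as a byte string and compares an ordinary big-endian word load with the byte-reversed load of the same location.

```python
def lwz(mem: bytes, ea: int) -> int:
    """Ordinary word load in big-endian mode."""
    return int.from_bytes(mem[ea:ea + 4], "big")

def lwbrx(mem: bytes, ea: int) -> int:
    """Byte-reversed word load: the same bytes interpreted in little-endian order."""
    return int.from_bytes(mem[ea:ea + 4], "little")

def lhbrx(mem: bytes, ea: int) -> int:
    """Byte-reversed half word load; the upper 16 bits of the result are zero."""
    return int.from_bytes(mem[ea:ea + 2], "little")

def stwbrx(mem: bytearray, ea: int, value: int) -> None:
    """Byte-reversed word store."""
    mem[ea:ea + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")
```

A word stored with stwbrx and read back with lwbrx is unchanged, whereas reading it back with lwz yields the byte-swapped value.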

Table 2-22. Integer Load and Store with Byte-Reverse Instructions

Name                                  Mnemonic  Operand Syntax
Load Half Word Byte-Reverse Indexed   lhbrx     rD,rA,rB
Load Word Byte-Reverse Indexed        lwbrx     rD,rA,rB
Store Half Word Byte-Reverse Indexed  sthbrx    rS,rA,rB
Store Word Byte-Reverse Indexed       stwbrx    rS,rA,rB

2.3.4.3.6 Integer Load and Store Multiple Instructions

The integer load/store multiple instructions are used to move blocks of data to and from the 
GPRs. In some implementations, these instructions are likely to have greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual load or store 
instructions that produce the same results. 

Implementation Notes — The following describes the 603e implementation of the
load/store multiple instructions:

• The load multiple and store multiple instructions may have operands that require 
memory accesses crossing a 4-Kbyte page boundary. As a result, these instructions 
may be interrupted by a DSI exception associated with the address translation of the 
second page. In this case, the 603e performs some or all of the memory references 
from the first page, and none of the memory references from the second page before 
taking the exception. On return from the DSI exception, the load or store multiple 
instruction will re-execute from the beginning. For additional information, refer to 
“DSI Exception (0x00300)” in Chapter 6, “Exceptions,” in The Programming 
Environments Manual. 

• The PowerPC architecture defines the load multiple word (lmw) instruction with rA
in the range of registers to be loaded as an invalid form. It defines the load multiple
and store multiple instructions with misaligned operands (that is, the EA is not a
multiple of 4) to cause an alignment exception. The 603e defines the load multiple
word (lmw) instruction with rA in the range of registers to be loaded as an invalid
form.

• The PowerPC architecture describes some preferred instruction forms for the integer 
load and store multiple instructions that may perform better than other forms in 
some implementations. None of these preferred forms have an effect on instruction 
performance in the 603e. 

When the 603e is operating with little-endian byte order, execution of a load or store 
multiple instruction causes the system alignment error handler to be invoked; see “Byte 
Ordering” in Chapter 3, “Operand Conventions,” in The Programming Environments 
Manual for more information. Table 2-23 lists the integer load and store multiple 
instructions for the 603e. 
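The operation of the load multiple word instruction, including the invalid form and the alignment requirement noted above, can be sketched as follows. This Python model is an illustration only: exceptions are represented by Python exceptions, page crossing and little-endian behavior are not modeled, and the register file is a plain list.

```python
def sign_extend16(d: int) -> int:
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def lmw(gpr: list, mem: bytes, rd: int, ra: int, d: int) -> None:
    """Load consecutive words into rD through r31."""
    if ra >= rd:
        # rA lies in the range rD..r31 (this includes the case rA = 0 with rD = 0)
        raise ValueError("invalid form: rA is in the range of registers loaded")
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + sign_extend16(d)) & 0xFFFFFFFF
    if ea % 4 != 0:
        raise RuntimeError("alignment exception: EA is not a multiple of 4")
    for reg in range(rd, 32):
        gpr[reg] = int.from_bytes(mem[ea:ea + 4], "big")
        ea += 4
```

For example, with rD = 29 the model performs three word loads, into r29, r30, and r31.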



Table 2-23. Integer Load and Store Multiple Instructions

Name                 Mnemonic  Operand Syntax
Load Multiple Word   lmw       rD,d(rA)
Store Multiple Word  stmw      rS,d(rA)

2.3.4.3.7 Integer Load and Store String Instructions

The integer load and store string instructions allow movement of data from memory to 
registers or from registers to memory without concern for alignment. These instructions can 
be used for a short move between arbitrary memory locations or to initiate a long move 
between misaligned memory fields. 

When the 603e is operating with little-endian byte order, execution of a load or store string 
instruction causes the system alignment error handler to be invoked; see “Byte Ordering” 
in Chapter 3, “Operand Conventions,” in The Programming Environments Manual for 
more information. 
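How a string load packs bytes into registers can be sketched with a model of the lswi instruction. The Python code below is illustrative and follows the architectural definition as understood here: an NB value of 0 means 32 bytes, bytes fill each register from the most significant end, unused low-order bytes of the last register are cleared, and the register sequence wraps from r31 to r0. Little-endian mode and exceptions are not modeled.

```python
def lswi(gpr: list, mem: bytes, rd: int, ra: int, nb: int) -> None:
    """Load String Word Immediate: move NB bytes from memory into successive GPRs."""
    ea = 0 if ra == 0 else gpr[ra]
    count = 32 if nb == 0 else nb
    reg = rd
    for offset in range(0, count, 4):
        chunk = mem[ea + offset:ea + min(offset + 4, count)]
        chunk = chunk.ljust(4, b"\x00")          # left-justify; clear unused bytes
        gpr[reg] = int.from_bytes(chunk, "big")
        reg = (reg + 1) % 32                     # wrap from r31 to r0
```

Loading the seven bytes "ABCDEFG" starting at r5 places 0x41424344 in r5 and 0x45464700 in r6; no alignment of the source address is required.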

Table 2-24 lists the integer load and store string instructions. 



Table 2-24. Integer Load and Store String Instructions

Name                         Mnemonic  Operand Syntax
Load String Word Immediate   lswi      rD,rA,NB
Load String Word Indexed     lswx      rD,rA,rB
Store String Word Immediate  stswi     rS,rA,NB
Store String Word Indexed    stswx     rS,rA,rB



Load string and store string instructions may involve operands that are not word-aligned.
As described in “Alignment Exception (0x00600)” in Chapter 6, “Exceptions,” in The
Programming Environments Manual, a misaligned string operation suffers a performance
penalty compared to a word-aligned operation of the same type.

When a string operation crosses a 4-Kbyte boundary, the instruction may be interrupted by 
a DSI exception associated with the address translation of the second page. In this case, the 
603e performs some or all memory references from the first page and none from the second 
before taking the exception. On return from the DSI exception, the load or store string 
instruction will re-execute from the beginning. For more information, refer to “DSI 
Exception (0x00300)” in Chapter 6, “Exceptions,” in The Programming Environments 
Manual. 

Implementation Note — If rA is in the range of registers to be loaded for a Load String
Word Immediate (lswi) instruction or if either rA or rB is in the range of registers to be
loaded for a Load String Word Indexed (lswx) instruction, the PowerPC architecture
defines the instruction to be of an invalid form. In addition, the lswx and stswx instructions
that specify a string length of zero are defined to be invalid by the PowerPC architecture.
However, neither of these cases holds true for the 603e, which treats these cases as valid
forms.

2.3.4.3.8 Floating-Point Load and Store Address Generation

Floating-point load and store operations generate effective addresses using the register 
indirect with immediate index addressing mode and register indirect with index addressing 
mode, the details of which are described below. Floating-point loads and stores are not 
supported for direct-store accesses. The use of the floating-point load and store operations 
for direct-store accesses will result in a DSI exception. 

2.3.4.3.9 Floating-Point Load Instructions 

There are two forms of the floating-point load instruction: single-precision and
double-precision operand formats. Because the FPRs support only the floating-point
double-precision format, single-precision floating-point load instructions convert
single-precision data to double-precision format before loading the operands into the target FPR. This
conversion is described fully in “Floating-Point Load Instructions” in Appendix D, 
“Floating-Point Models,” in The Programming Environments Manual. 
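The format conversion performed by a single-precision load can be illustrated with the host's IEEE 754 support. This Python sketch is an approximation rather than the bit-level algorithm of Appendix D: it relies on the struct module to widen the value, which gives the same result for normalized numbers, denormalized numbers, zeros, and infinities, but may not preserve the exact bit pattern of signaling NaNs.

```python
import struct

def lfs(mem: bytes, ea: int) -> int:
    """Return the 64-bit FPR image produced by loading a single-precision operand."""
    (value,) = struct.unpack(">f", mem[ea:ea + 4])            # read 32-bit single
    return struct.unpack(">Q", struct.pack(">d", value))[0]   # widen to double format

def lfd(mem: bytes, ea: int) -> int:
    """A double-precision load copies the 64-bit operand unchanged."""
    return int.from_bytes(mem[ea:ea + 8], "big")
```

For example, the single-precision image 0x3FC00000 (1.5) becomes the double-precision image 0x3FF8000000000000 in the target register.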

Implementation Note — The PowerPC architecture defines load with update instructions
with rA = 0 as an invalid form; however, the 603e treats this case as a valid form.

Table 2-25 provides a list of the floating-point load instructions. 



Table 2-25. Floating-Point Load Instructions

Name                                            Mnemonic  Operand Syntax
Load Floating-Point Single                      lfs       frD,d(rA)
Load Floating-Point Single Indexed              lfsx      frD,rA,rB
Load Floating-Point Single with Update          lfsu      frD,d(rA)
Load Floating-Point Single with Update Indexed  lfsux     frD,rA,rB
Load Floating-Point Double                      lfd       frD,d(rA)
Load Floating-Point Double Indexed              lfdx      frD,rA,rB
Load Floating-Point Double with Update          lfdu      frD,d(rA)
Load Floating-Point Double with Update Indexed  lfdux     frD,rA,rB



2.3.4.3.10 Floating-Point Store Instructions 

There are three basic forms of the store instruction: single-precision, double-precision,
and integer. The integer form is supported by the optional stfiwx instruction. Because the
FPRs support only the floating-point double-precision format for floating-point data,
single-precision floating-point store instructions convert double-precision data to
single-precision format before storing the operands. The conversion steps are described fully in
“Floating-Point Store Instructions” in Appendix D, “Floating-Point Models,” in The
Programming Environments Manual.
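The three store forms can be contrasted with a short model. This Python sketch is illustrative and makes one notable assumption: the single-precision store is modeled with the host's double-to-single conversion, which rounds, whereas the hardware conversion expects the register to already hold a value representable in single precision (for example, the result of frsp). The integer form simply stores the low-order 32 bits of the register.

```python
import struct

def stfs(fpr_bits: int) -> bytes:
    """Single-precision store (assumes the value is representable as a single)."""
    (value,) = struct.unpack(">d", struct.pack(">Q", fpr_bits))
    return struct.pack(">f", value)

def stfd(fpr_bits: int) -> bytes:
    """Double-precision store: the 64-bit register image is stored unchanged."""
    return fpr_bits.to_bytes(8, "big")

def stfiwx(fpr_bits: int) -> bytes:
    """Integer word store: the low-order 32 bits are stored without conversion."""
    return (fpr_bits & 0xFFFFFFFF).to_bytes(4, "big")
```

The integer form pairs naturally with fctiw and fctiwz, which leave their 32-bit result in the low-order half of the target register.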

Implementation Note — The PowerPC architecture defines store with update instructions
with rA = 0 as an invalid form; however, the 603e treats this case as valid.

Table 2-26 provides a list of the floating-point store instructions.



Table 2-26. Floating-Point Store Instructions

Name                                             Mnemonic  Operand Syntax
Store Floating-Point Single                      stfs      frS,d(rA)
Store Floating-Point Single Indexed              stfsx     frS,rA,rB
Store Floating-Point Single with Update          stfsu     frS,d(rA)
Store Floating-Point Single with Update Indexed  stfsux    frS,rA,rB
Store Floating-Point Double                      stfd      frS,d(rA)
Store Floating-Point Double Indexed              stfdx     frS,rA,rB
Store Floating-Point Double with Update          stfdu     frS,d(rA)
Store Floating-Point Double with Update Indexed  stfdux    frS,rA,rB
Store Floating-Point as Integer Word Indexed     stfiwx    frS,rA,rB



2.3.4.4 Branch and Flow Control Instructions

Branch instructions are executed by the branch processing unit (BPU). The BPU receives 
branch instructions from the fetch unit and performs condition register (CR) look-ahead 
operations on conditional branches to resolve them early, achieving the effect of a
zero-cycle branch in many cases.

Some branch instructions can redirect instruction execution conditionally based on the 
value of bits in the CR. When the branch processor encounters one of these instructions, it 
scans the execution pipelines to determine whether an instruction in progress may affect the 
particular CR bit. If no interlock is found, the branch can be resolved immediately by 
checking the bit in the CR and taking the action defined for the branch instruction. 

If an interlock is detected, the branch is considered unresolved and the direction of the 
branch is predicted using static branch prediction as described in “Conditional Branch 
Control” in Chapter 4, “Addressing Modes and Instruction Set Summary,” in The 
Programming Environments Manual. The interlock is monitored while instructions are
fetched for the predicted branch. When the interlock is cleared, the branch processor 
determines whether the prediction was correct based on the value of the CR bit. If the 
prediction is correct, the branch is considered completed and instruction fetching continues. 
If the prediction is incorrect, the fetched instructions are purged, and instruction fetching 
continues along the alternate path. See Chapter 8, “Instruction Timing,” in The 
Programming Environments Manual for more information about how branches are 
executed. 
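The static prediction rule referred to above can be sketched in a few lines. The Python model below is an illustration based on the architecture's general description of the prediction hint for the bc instruction, and it is an assumption here that this rule applies unchanged: a branch with a negative displacement is predicted taken, any other branch is predicted not taken, and the y bit of the BO field reverses the default. The function names are hypothetical.

```python
def predict_taken(displacement: int, y_bit: int) -> bool:
    """Static prediction for an unresolved bc instruction."""
    default = displacement < 0          # backward branches default to taken
    return default != bool(y_bit)       # the y bit reverses the default

def resolve(predicted_taken: bool, actually_taken: bool) -> str:
    """Describe what happens when the CR interlock clears."""
    if predicted_taken == actually_taken:
        return "continue fetching"
    return "purge fetched instructions and fetch the alternate path"
```

A loop-closing branch with a negative displacement and y = 0 is therefore predicted taken, which matches the common case for loops.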

2.3.4.4.1 Branch Instruction Address Calculation 

Branch instructions can alter the sequence of instruction execution. Instruction addresses 
are always assumed to be word aligned; the processor ignores the two low-order bits of the 
generated branch target address.

Branch instructions compute the effective address (EA) of the next instruction address
using the following addressing modes: 

• Branch relative 

• Branch conditional to relative address 

• Branch to absolute address 

• Branch conditional to absolute address 

• Branch conditional to link register 

• Branch conditional to count register 
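The target address calculations for these modes can be sketched as follows. This Python model is illustrative; it assumes the architectural instruction fields (a 24-bit LI field for unconditional branches and a 14-bit BD field for conditional branches, each extended with two low-order zero bits and sign-extended) and the AA bit that selects absolute addressing. The function names are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def b_target(cia: int, li: int, aa: int) -> int:
    """Branch (b, ba): LI is the 24-bit field; cia is the branch's own address."""
    offset = sign_extend(li << 2, 26)
    return (offset if aa else cia + offset) & 0xFFFFFFFC

def bc_target(cia: int, bd: int, aa: int) -> int:
    """Branch conditional (bc, bca): BD is the 14-bit field."""
    offset = sign_extend(bd << 2, 16)
    return (offset if aa else cia + offset) & 0xFFFFFFFC

def register_target(reg_value: int) -> int:
    """bclr and bcctr: the two low-order bits of LR or CTR are ignored."""
    return reg_value & 0xFFFFFFFC
```

In every case the two low-order bits of the result are zero, reflecting the word alignment of instruction addresses.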

2.3.4.4.2 Branch Instructions

Table 2-27 lists the branch instructions provided by the PowerPC processors. To simplify 
assembly language programming, a set of simplified mnemonics and symbols is provided 
for the most frequently used forms of branch conditional, compare, trap, rotate and shift, 
and certain other instructions. See Appendix F, “Simplified Mnemonics,” in The 
Programming Environments Manual for a list of simplified mnemonic examples. 



Table 2-27. Branch Instructions

Name                                  Mnemonic           Operand Syntax
Branch                                b (ba bl bla)      target_addr
Branch Conditional                    bc (bca bcl bcla)  BO,BI,target_addr
Branch Conditional to Link Register   bclr (bclrl)       BO,BI
Branch Conditional to Count Register  bcctr (bcctrl)     BO,BI



2.3.4.4.3 Condition Register Logical Instructions 

Condition register logical instructions, shown in Table 2-28, and the Move Condition 
Register Field (mcrf) instruction are also defined as flow control instructions, although they 
are executed by the system register unit (SRU). Most instructions executed by the SRU are 
completion-serialized to maintain system state; that is, the instruction is held for execution 
in the SRU until all prior instructions issued have completed. 
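The effect of the condition register logical instructions on individual CR bits can be modeled directly. The Python sketch below is illustrative; it assumes the architecture's bit numbering, in which bit 0 is the most significant bit of the 32-bit CR, and uses hypothetical helper names.

```python
def get_bit(cr: int, n: int) -> int:
    return (cr >> (31 - n)) & 1

def put_bit(cr: int, n: int, value: int) -> int:
    mask = 1 << (31 - n)
    return (cr | mask) if value else (cr & ~mask & 0xFFFFFFFF)

OPS = {
    "crand":  lambda a, b: a & b,
    "cror":   lambda a, b: a | b,
    "crxor":  lambda a, b: a ^ b,
    "crnand": lambda a, b: 1 ^ (a & b),
    "crnor":  lambda a, b: 1 ^ (a | b),
    "creqv":  lambda a, b: 1 ^ (a ^ b),
    "crandc": lambda a, b: a & (1 ^ b),
    "crorc":  lambda a, b: a | (1 ^ b),
}

def cr_logical(op: str, cr: int, crbd: int, crba: int, crbb: int) -> int:
    """Apply one CR logical instruction and return the new CR value."""
    return put_bit(cr, crbd, OPS[op](get_bit(cr, crba), get_bit(cr, crbb)))

def mcrf(cr: int, crfd: int, crfs: int) -> int:
    """Copy the 4-bit field crfS into field crfD."""
    field = (cr >> (28 - 4 * crfs)) & 0xF
    shift = 28 - 4 * crfd
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (field << shift)
```

Using the same bit for all three operands gives two familiar idioms: crxor clears the bit and creqv sets it.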



Table 2-28. Condition Register Logical Instructions

Name                                    Mnemonic  Operand Syntax
Condition Register AND                  crand     crbD,crbA,crbB
Condition Register OR                   cror      crbD,crbA,crbB
Condition Register XOR                  crxor     crbD,crbA,crbB
Condition Register NAND                 crnand    crbD,crbA,crbB
Condition Register NOR                  crnor     crbD,crbA,crbB
Condition Register Equivalent           creqv     crbD,crbA,crbB
Condition Register AND with Complement  crandc    crbD,crbA,crbB
Condition Register OR with Complement   crorc     crbD,crbA,crbB
Move Condition Register Field           mcrf      crfD,crfS



Note that if the LR update option is enabled for any of these instructions, these forms of the 
instructions are invalid in the 603e. 

2.3.4.5 Trap Instructions

The trap instructions shown in Table 2-29 are provided to test for a specified set of 
conditions. If any of the conditions tested by a trap instruction are met, the system trap 
handler is invoked. If the tested conditions are not met, instruction execution continues
normally.
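The condition test performed by the trap instructions can be sketched as follows. This Python model is illustrative and assumes the architectural encoding of the 5-bit TO field, whose bits select, from most to least significant, signed less than, signed greater than, equal, unsigned less than, and unsigned greater than. The function names are hypothetical.

```python
def to_signed(value: int) -> int:
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def trap_taken(to: int, a: int, b: int) -> bool:
    """Return True if tw/twi with the given TO field would invoke the trap handler."""
    sa, sb = to_signed(a), to_signed(b)
    ua, ub = a & 0xFFFFFFFF, b & 0xFFFFFFFF
    return bool(
        (to & 0b10000 and sa < sb) or
        (to & 0b01000 and sa > sb) or
        (to & 0b00100 and ua == ub) or
        (to & 0b00010 and ua < ub) or
        (to & 0b00001 and ua > ub)
    )
```

For tw the second value is the contents of rB; for twi it is the sign-extended SIMM field. A TO value of 31 traps unconditionally, because at least one of the five conditions always holds.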



Table 2-29. Trap Instructions

Name                 Mnemonic  Operand Syntax
Trap Word Immediate  twi       TO,rA,SIMM
Trap Word            tw        TO,rA,rB



See Appendix F, “Simplified Mnemonics,” in The Programming Environments Manual for 
a complete set of simplified mnemonics. 

2.3.4.6 Processor Control Instructions

Processor control instructions are used to read from and write to the condition register 
(CR), machine state register (MSR), and special-purpose registers (SPRs), and to read from 
the time base register (TBU or TBL). 

2.3.4.6.1 Move to/from Condition Register Instructions 

Table 2-30 lists the instructions provided by the 603e for reading from or writing to the CR.



Table 2-30. Move to/from Condition Register Instructions

Name                                 Mnemonic  Operand Syntax
Move to Condition Register Fields    mtcrf     CRM,rS
Move to Condition Register from XER  mcrxr     crfD
Move from Condition Register         mfcr      rD

2.3.4.7 Memory Synchronization Instructions — UISA

Memory synchronization instructions control the order in which memory operations are 
completed with respect to asynchronous events, and the order in which memory operations 
are seen by other processors or memory access mechanisms. See Chapter 3, “Instruction 
and Data Cache Operation,” for additional information about these instructions and about 
related aspects of memory synchronization. 

The sync instruction delays execution of subsequent instructions until previous instructions 
have completed to the point that they can no longer cause an exception and until all 
previous memory accesses are performed globally; the sync operation is not broadcast onto 
the 603e bus interface. Additionally, all load and store cache/bus activities initiated by prior
instructions are completed. Touch load operations (dcbt and dcbtst) are required to
complete at least through address translation, but not required to complete on the bus.

The functions performed by the sync instruction normally take a significant amount of time 
to complete; as a result, frequent use of this instruction may adversely affect performance. 
In addition, the number of cycles required to complete a sync instruction depends on 
system parameters and on the processor's state when the instruction is issued. 

The proper paired use of the lwarx and stwcx. instructions allows programmers to emulate
common semaphore operations such as “test and set,” “compare and swap,” “exchange
memory,” and “fetch and add.” Examples of these semaphore operations can be found in
Appendix E, “Synchronization Programming Examples,” in The Programming
Environments Manual. The lwarx instruction must be paired with an stwcx. instruction
with the same effective address used for both instructions of the pair. Note that the
reservation granularity is 32 bytes.

The concept behind the use of the lwarx and stwcx. instructions is that a processor may
load a semaphore from memory, compute a result based on the value of the semaphore, and
conditionally store it back to the same location (only if that location has not been modified
since it was first read), and determine if the store was successful. The conditional store is
performed based upon the existence of a reservation established by the preceding lwarx
instruction. If the reservation exists when the store is executed, the store is performed and 
a bit is set in the CR. If the reservation does not exist when the store is executed, the target 
memory location is not modified and a bit is cleared in the CR. 

If the store was successful, the sequence of instructions from the read of the semaphore to
the store that updated the semaphore appears to have been executed atomically (that is, no
other processor or mechanism modified the semaphore location between the read and the 
update), thus providing the equivalent of a real atomic operation. However, in reality, other 
processors may have read from the location during this operation. In the 603e, the 
reservations are made on behalf of aligned 32-byte sections of the memory address space. 
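The reservation mechanism can be illustrated with a toy model of a "fetch and add" loop. The Python sketch below is not from the manual: the class and function names are hypothetical, memory is a dictionary of word addresses, a store by another device is represented by an explicit snoop call, and the CR bit that reports success is represented by the return value of stwcx.

```python
GRANULE = 32    # reservation granularity in bytes

class ToyProcessor:
    def __init__(self, memory: dict):
        self.memory = memory
        self.reserved_granule = None      # at most one reservation at a time
        self.failed_stores = 0

    def lwarx(self, ea: int) -> int:
        if ea % 4:
            raise ValueError("EA must be word aligned")
        self.reserved_granule = ea // GRANULE
        return self.memory[ea]

    def stwcx(self, ea: int, value: int) -> bool:
        if ea % 4:
            raise ValueError("EA must be word aligned")
        success = self.reserved_granule is not None
        if success:
            self.memory[ea] = value
        else:
            self.failed_stores += 1
        self.reserved_granule = None      # any stwcx. clears the reservation
        return success

    def snoop_store(self, ea: int) -> None:
        """Another device modifies a location; clear a matching reservation."""
        if self.reserved_granule == ea // GRANULE:
            self.reserved_granule = None

def fetch_and_add(cpu: ToyProcessor, ea: int, delta: int, interference=None) -> int:
    """Retry until the conditional store succeeds; return the old value."""
    while True:
        old = cpu.lwarx(ea)
        if interference is not None:      # optional one-time disturbance for demos
            interference()
            interference = None
        if cpu.stwcx(ea, old + delta):
            return old
```

If another device writes anywhere in the same 32-byte granule between the load and the conditional store, the store fails and the loop simply retries with a fresh value.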

The lwarx and stwcx. instructions require the EA to be aligned. Exception handling
software should not attempt to emulate a misaligned lwarx or stwcx. instruction, because
there is no correct way to define the address associated with the reservation.

In general, the lwarx and stwcx. instructions should be used only in system programs,
which can be invoked by application programs as needed.

At most, one reservation exists simultaneously on any processor. The address associated
with the reservation can be changed by a subsequent lwarx instruction. The conditional
store is performed based upon the existence of a reservation established by the preceding
lwarx regardless of whether the address generated by the lwarx matches that generated by
the stwcx. instruction. A reservation held by the processor is cleared by one of the
following: 

• Executing an stwcx. instruction to any address 

• Attempt by some other device to modify a location in the reservation granularity 
(32 bytes) 

The lwarx and stwcx. instructions in write-through access mode do not cause a DSI
exception. 

Table 2-31 lists the UISA memory synchronization instructions for the 603e.



Table 2-31. Memory Synchronization Instructions — UISA

Name                            Mnemonic  Operand Syntax
Load Word and Reserve Indexed   lwarx     rD,rA,rB
Store Word Conditional Indexed  stwcx.    rS,rA,rB
Synchronize                     sync      —



2.3.5 PowerPC VEA Instructions 

The PowerPC VEA describes the semantics of the memory model that can be assumed by 
software processes, and includes descriptions of the cache model, cache-control 
instructions, address aliasing, and other related issues. 

2.3.5.1 Processor Control Instructions 

In addition to the move to condition register instructions specified by the UISA, the VEA 
defines the Move from Time Base (mftb) instruction for reading the contents of the time 
base register. The mftb instruction is a user-level instruction; it is shown in Table 2-32.

Simplified mnemonics are provided for the mftb instruction so it can be coded with the 
TBR name as part of the mnemonic rather than requiring it to be coded as an operand. The 
mftb instruction serves as both a basic and simplified mnemonic. Assemblers recognize an 
mftb mnemonic with two operands as the basic form, and an mftb mnemonic with one 
operand as the simplified form. Simplified mnemonics are also provided for Move from 
Time Base Upper (mftbu), which is a variant of the mftb instruction rather than of mfspr. 
The 603e ignores the extended opcode differences between mftb and mfspr by ignoring 
bit 25 of both instructions and treating them both identically. For more information refer to 
Appendix F, “Simplified Mnemonics,” in The Programming Environments Manual.

Table 2-32. Move from Time Base Instruction

Name                 Mnemonic  Operand Syntax
Move from Time Base  mftb      rD,TBR



2.3.5.2 Memory Synchronization Instructions — VEA

Memory synchronization instructions control the order in which memory operations are 
completed with respect to asynchronous events, and the order in which memory operations 
are seen by other processors or memory access mechanisms. See Chapter 3, “Instruction 
and Data Cache Operation,” for additional information about these instructions and about 
related aspects of memory synchronization. 

Implementation Notes — The following describes how the 603e handles memory
synchronization in the VEA.

• The Instruction Synchronize (isync) instruction causes the 603e to discard all 
prefetched instructions, wait for any preceding instructions to complete, and then 
branch to the next sequential instruction (which has the effect of clearing the 
pipeline behind the isync instruction). 

• The Enforce In-Order Execution of I/O (eieio) instruction is used to enforce the
ordering of noncacheable memory accesses. Since the 603e does not reorder
noncacheable memory accesses, the eieio instruction is treated as a no-op.

Table 2-33 lists the VEA memory synchronization instructions for the 603e.



Table 2-33. Memory Synchronization Instructions — VEA

Name                               Mnemonic  Operand Syntax
Enforce In-Order Execution of I/O  eieio     —
Instruction Synchronize            isync     —



2.3.5.3 Memory Control Instructions — VEA

Memory control instructions include the following types: 

• Cache management instructions 

• Segment register manipulation instructions 

• Translation lookaside buffer management instructions 

This section describes the user-level cache management instructions defined by the VEA. 
See Section 2.3.6.3, “Memory Control Instructions — OEA,” for information about
supervisor-level cache, segment register manipulation, and translation lookaside buffer 
management instructions. 

The instructions listed in Table 2-34 provide user-level programs the ability to manage
on-chip caches when they exist.

As with other memory-related instructions, the effects of the cache management instructions
on memory are weakly ordered. If the programmer needs to ensure that cache or other
instructions have been performed with respect to all other processors and system
mechanisms, a sync instruction must be placed in the program following those instructions.

Note that when data address translation is disabled (MSR[DR] = 0), the Data Cache Block 
Set to Zero (dcbz) instruction allocates a cache block in the cache and may not verify that 
the physical address is valid. If a cache block is created for an invalid physical address, a 
machine check condition may result when an attempt is made to write that cache block back 
to memory. The cache block could be written back either when an instruction causes a
cache miss and the invalidly addressed cache block is chosen as the target for replacement,
or when a Data Cache Block Store (dcbst) instruction is executed.

Note that any cache control instruction that generates an effective address that corresponds 
to a direct-store segment (SR[T] = 1) is treated as a no-op. 

Table 2-34 lists the cache instructions that are accessible to user-level programs. 



Table 2-34. User-Level Cache Instructions 

Name                                 Mnemonic   Operand Syntax
Data Cache Block Touch               dcbt       rA,rB
Data Cache Block Touch for Store     dcbtst     rA,rB
Data Cache Block Set to Zero         dcbz       rA,rB
Data Cache Block Store               dcbst      rA,rB
Data Cache Block Flush               dcbf       rA,rB
Instruction Cache Block Invalidate   icbi       rA,rB



2.3.5.4 External Control Instructions 

The external control instructions allow a user-level program to communicate with a special- 
purpose device. Executing these instructions when MSR[DR] = 0 causes a programming 
error, and the physical address on the bus is undefined. Executing these instructions to a 
direct-store segment causes a DSI exception. The external control instructions are listed in 
Table 2-35. 



Table 2-35. External Control Instructions 

Name                                 Mnemonic   Operand Syntax
External Control In Word Indexed     eciwx      rD,rA,rB
External Control Out Word Indexed    ecowx      rS,rA,rB

2.3.6 PowerPC OEA Instructions 

The PowerPC OEA includes the structure of the memory management model, supervisor- 
level registers, and the exception model. 

2.3.6.1 System Linkage Instructions 

This section describes the system linkage instructions (see Table 2-36). The sc instruction 
is a user-level instruction that permits a user program to call on the system to perform a 
service and causes the processor to take an exception. The Return from Interrupt (rfi) 
instruction is a supervisor-level instruction that is useful for returning from an exception 
handler. 



Table 2-36. System Linkage Instructions 

Name                                 Mnemonic   Operand Syntax
System Call                          sc         —
Return from Interrupt                rfi        —



2.3.6.2 Processor Control Instructions — OEA 

Processor control instructions are used to read from and write to the condition register 
(CR), machine state register (MSR), and special-purpose registers (SPRs), and to read from 
the time base register (TBU or TBL). 

2.3.6.2.1 Move to/from Machine State Register Instructions 

Table 2-37 lists the instructions provided by the 603e for reading from or writing to the 
MSR. 



Table 2-37. Move to/from Machine State Register Instructions 

Name                                 Mnemonic   Operand Syntax
Move to Machine State Register       mtmsr      rS
Move from Machine State Register     mfmsr      rD



2.3.6.2.2 Move to/from Special-Purpose Register Instructions 

Simplified mnemonics are provided for the mtspr and mfspr instructions so they can be 
coded with the SPR name as part of the mnemonic rather than as a numeric operand. See 
Appendix F, “Simplified Mnemonics,” in The Programming Environments Manual for 
simplified mnemonic examples. The mtspr and mfspr instructions are shown in 
Table 2-38. 

Table 2-38. Move to/from Special-Purpose Register Instructions 

Name                                 Mnemonic   Operand Syntax
Move to Special-Purpose Register     mtspr      SPR,rS
Move from Special-Purpose Register   mfspr      rD,SPR

For mtspr and mfspr instructions, the SPR number coded in assembly language does not 
appear directly as a 10-bit binary number in the instruction. The number coded is split into 
two 5-bit halves that are reversed in the instruction encoding, with the high-order 5 bits 
appearing in bits 16-20 of the instruction encoding and the low-order 5 bits in bits 11-15. 
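
The split-and-swap of the SPR number can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the primary opcode (31) and the mfspr extended opcode (339) come from the PowerPC architecture definition rather than from this section.

```python
def spr_field(spr):
    """Return the 10-bit SPR field as it appears in instruction bits 11-20."""
    high = (spr >> 5) & 0x1F   # high-order 5 bits of the SPR number
    low = spr & 0x1F           # low-order 5 bits of the SPR number
    # The low-order half lands in bits 11-15, the high-order half in bits 16-20.
    return (low << 5) | high


def encode_mfspr(rd, spr):
    """Assemble mfspr rD,SPR (bit 0 is the most-significant bit of the word)."""
    return (31 << 26) | (rd << 21) | (spr_field(spr) << 11) | (339 << 1)


# SPR 1008 is 11111 10000 in binary, so the encoded field is 10000 11111.
print(format(spr_field(1008), "010b"))
print(hex(encode_mfspr(3, 1008)))
```

The same field layout applies to mtspr; only the extended opcode and the role of the register operand differ.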

If the SPR field contains any value other than one of the values shown in Table 2-39, either 
the program exception handler is invoked or the results are boundedly undefined. 



Table 2-39. SPR Encodings for PowerPC 603e-Defined Registers (mfspr) 

                SPR*
Decimal   spr[5-9]   spr[0-4]   Register Name
976       11110      10000      DMISS
977       11110      10001      DCMP
978       11110      10010      HASH1
979       11110      10011      HASH2
980       11110      10100      IMISS
981       11110      10101      ICMP
982       11110      10110      RPA
1008      11111      10000      HID0
1009      11111      10001      HID1
1010      11111      10010      IABR

* Note that the order of the two 5-bit halves of the SPR number is reversed 
compared with actual instruction coding. 

For mtspr and mfspr instructions, the SPR number coded in assembly 
language does not appear directly as a 10-bit binary number in the instruction. 

The number coded is split into two 5-bit halves that are reversed in the 
instruction, with the high-order 5 bits appearing in bits 16-20 of the instruction 
and the low-order 5 bits in bits 11-15. 

Implementation Note — The 603e ignores the extended opcode differences between mftb 
and mfspr by ignoring TB [25] and treating both instructions identically. 
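
As a rough illustration of why a single ignored bit makes the two instructions decode identically, the sketch below compares their extended opcodes. The values 339 (mfspr) and 371 (mftb) are taken from the PowerPC architecture definition, not from this section, and the helper name is hypothetical.

```python
MFSPR_XO = 339  # extended opcode of mfspr (instruction bits 21-30)
MFTB_XO = 371   # extended opcode of mftb (instruction bits 21-30)


def differing_instruction_bits(xo_a, xo_b):
    """List the instruction bit numbers (bit 0 = MSB) where two opcodes differ."""
    diff = (xo_a ^ xo_b) << 1          # the extended opcode sits one bit above bit 31
    return [31 - pos for pos in range(32) if diff & (1 << pos)]


print(differing_instruction_bits(MFSPR_XO, MFTB_XO))
```

Under these assumptions the two encodings differ in exactly one bit position, so a decoder that ignores that position cannot tell them apart.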

2.3.6.3 Memory Control Instructions — OEA 

This section describes memory control instructions, which include the following types: 

• Cache management instructions 

• Segment register manipulation instructions 

• Translation lookaside buffer management instructions 

2.3.6.3.1 Supervisor-Level Cache Management Instruction 

Table 2-40 lists the only supervisor-level cache management instruction. See 
Section 2.3.5.3, “Memory Control Instructions — VEA,” for a description of cache 
instructions that provide user-level programs the ability to manage the on-chip caches. If 
the effective address references a direct-store segment, the instruction is treated as a no-op. 

When data translation is disabled, MSR[DR] = 0, the dcbz instruction establishes a block 
in the cache and may not verify that the physical address is valid. If a block is created for 
an invalid real address, a machine check exception may result when an attempt is made to 
write that block back to memory. The block could be written back as the result of the 
execution of an instruction that causes a cache miss and the invalid address block is the 
target for replacement or as the result of a dcbst instruction. 



Table 2-40. Supervisor-Level Cache Management Instruction 

Name                                 Mnemonic   Operand Syntax
Data Cache Block Invalidate          dcbi       rA,rB



2.3.6.3.2 Segment Register Manipulation Instructions 

The instructions listed in Table 2-41 provide access to the segment registers for the 603e. 
These instructions operate completely independently of the MSR[IR] and MSR[DR] bit 
settings. Refer to “Synchronization Requirements for Special Registers and TLBs” in 
Chapter 2, “Register Set,” in The Programming Environments Manual for serialization 
requirements and other recommended precautions to observe when manipulating the 
segment registers. 



Table 2-41. Segment Register Manipulation Instructions 

Name                                  Mnemonic   Operand Syntax
Move to Segment Register              mtsr       SR,rS
Move to Segment Register Indirect     mtsrin     rS,rB
Move from Segment Register            mfsr       rD,SR
Move from Segment Register Indirect   mfsrin     rD,rB



2.3.6.3.3 Translation Lookaside Buffer Management Instructions 

The address translation mechanism is defined in terms of segment descriptors and page 
table entries (PTEs) used by PowerPC processors to locate the effective-to-physical address 
mapping for a particular access. The PTEs reside in page tables in memory. As defined for 
32-bit implementations by the PowerPC architecture, segment descriptors reside in 16 
on-chip segment registers. 

Implementation Note — The 603e provides the ability to invalidate a TLB entry. The TLB 
Invalidate Entry (tlbie) instruction invalidates the TLB entry indexed by the EA, and 
operates on both the instruction and data TLBs simultaneously, invalidating four TLB 
entries (both sets in each TLB). The index corresponds to bits 15-19 of the EA. To 
invalidate all entries within both TLBs, 32 tlbie instructions should be issued, incrementing 
this field by one each time. 
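
The 32-step sweep can be sketched as follows. This Python fragment only computes the effective addresses that system software would place in rB before each tlbie. The function name is hypothetical, and the remaining EA bits are simply left at zero, which is an assumption made for illustration.

```python
EA_INDEX_SHIFT = 31 - 19   # EA bit 19 (bit 0 = MSB) sits 12 positions above the LSB
EA_INDEX_ENTRIES = 32      # EA[15-19] is a 5-bit index


def tlbie_sweep_addresses():
    """Return one EA per TLB index, stepping the EA[15-19] field by one each time."""
    return [index << EA_INDEX_SHIFT for index in range(EA_INDEX_ENTRIES)]


for ea in tlbie_sweep_addresses()[:4]:
    print(hex(ea))
```

Because the index field starts at EA bit 19, stepping it by one corresponds to stepping the address by 4 Kbytes.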

The 603e provides two implementation-specific instructions (tlbld and tlbli) that are used 
by software table search operations following TLB misses to load TLB entries on-chip. 

For more information on tlbld and tlbli refer to Section 2.3.8, “PowerPC 603e 
Implementation-Specific Instructions.” 

Note that the tlbia instruction is not implemented on the 603e. 

Refer to Chapter 5, “Memory Management,” for more information about the TLB 
operations for the 603e. Table 2-42 lists the TLB instructions. 

Table 2-42. Translation Lookaside Buffer Management Instructions 

Name                                 Mnemonic   Operand Syntax
TLB Invalidate Entry                 tlbie      rB
TLB Synchronize                      tlbsync    —
Load Data TLB Entry                  tlbld      rB
Load Instruction TLB Entry           tlbli      rB



Because the presence and exact semantics of the translation lookaside buffer management 
instructions are implementation-dependent, system software should incorporate uses of the 
instructions into subroutines to maximize compatibility with programs written for other 
processors. 

For more information on the PowerPC instruction set, refer to Chapter 4, “Addressing 
Modes and Instruction Set Summary,” and Chapter 8, “Instruction Set,” in The 
Programming Environments Manual. 

2.3.7 Recommended Simplified Mnemonics 

To simplify assembly language programs, a set of simplified mnemonics is provided for 
some of the most frequently used operations (such as no-op, load immediate, load address, 
move register, and complement register). PowerPC-compliant assemblers provide the 
simplified mnemonics listed in “Recommended Simplified Mnemonics” in Appendix F, 
“Simplified Mnemonics,” in The Programming Environments Manual and listed with 
some of the instruction descriptions in this chapter. Programs written to be portable across 
the various assemblers for the PowerPC architecture should not assume the existence of 
mnemonics not described in this document. 

For a complete list of simplified mnemonics, see Appendix F, “Simplified Mnemonics,” in 
The Programming Environments Manual. 

2.3.8 PowerPC 603e Implementation-Specific Instructions 

This section provides a detailed look at the two 603e implementation-specific 
instructions: tlbld and tlbli. 

tlbld                                                                    tlbld
Load Data TLB Entry                                               Integer Unit

tlbld rB 

Bits    0-5   6-10    11-15   16-20   21-30   31
Field   31    00000   00000   B       978     0

Bits 6-15 and bit 31 are reserved. 

EA <- (rB) 

TLB entry created from DCMP and RPA 

DTLB entry selected by EA[15-19] and SRR1[WAY] <- created TLB entry 

The EA is the contents of rB. The tlbld instruction loads the contents of the data PTE 
compare (DCMP) and required physical address (RPA) registers into the first word of the 
selected data TLB entry. The specific DTLB entry to be loaded is selected by the EA and 
the SRR1[WAY] bit. 

The tlbld instruction should only be executed when address translation is disabled 
(MSR[IR] = 0 and MSR[DR] = 0). 

Note that it is possible to execute the tlbld instruction when address translation is enabled; 
however, extreme caution should be used in doing so. If data address translation is set 
(MSR[DR] = 1), tlbld must be preceded by a sync instruction and succeeded by a context 
synchronizing instruction. 

Note also that care should be taken to avoid modification of the instruction TLB entries that 
translate current instruction prefetch addresses. 

This is a supervisor-level instruction; it is also a 603e-specific instruction, and not part of 
the PowerPC instruction set. 

Other registers altered: 

• None 

tlbli                                                                    tlbli
Load Instruction TLB Entry                                        Integer Unit

tlbli rB 

Bits    0-5   6-10    11-15   16-20   21-30   31
Field   31    00000   00000   B       1010    0

Bits 6-15 and bit 31 are reserved. 

EA <- (rB) 

TLB entry created from ICMP and RPA 

ITLB entry selected by EA[15-19] and SRR1[WAY] <- created TLB entry 

The EA is the contents of rB. The tlbli instruction loads the contents of the instruction PTE 
compare (ICMP) and required physical address (RPA) registers into the first word of the 
selected instruction TLB entry. The specific ITLB entry to be loaded is selected by the EA 
and the SRR1[WAY] bit. 

The tlbli instruction should only be executed when address translation is disabled 
(MSR[IR] = 0 and MSR[DR] = 0). 

Note that it is possible to execute the tlbli instruction when address translation is enabled; 
however, extreme caution should be used in doing so. If instruction address translation is 
set (MSR[IR] = 1), tlbli must be followed by a context synchronizing instruction such as 
isync or rfi. 

Note also that care should be taken to avoid modification of the instruction TLB entries that 
translate current instruction prefetch addresses. 

This is a supervisor-level instruction; it is also a 603e-specific instruction, and not part of 
the PowerPC instruction set. 

Other registers altered: 

• None 
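
The field layouts given for tlbld and tlbli can be turned into instruction words with a short sketch such as the one below. It assumes the reserved fields are encoded as zeros, as the encoding diagrams indicate. The function and constant names are hypothetical.

```python
PRIMARY_OPCODE = 31   # instruction bits 0-5
TLBLD_XO = 978        # extended opcode of tlbld, instruction bits 21-30
TLBLI_XO = 1010       # extended opcode of tlbli, instruction bits 21-30


def encode_tlb_load(extended_opcode, rb):
    """Assemble tlbld/tlbli rB with all reserved fields cleared (bit 0 = MSB)."""
    if not 0 <= rb <= 31:
        raise ValueError("rB must name one of GPR0 through GPR31")
    return (PRIMARY_OPCODE << 26) | (rb << 11) | (extended_opcode << 1)


print(hex(encode_tlb_load(TLBLD_XO, 3)))   # tlbld r3
print(hex(encode_tlb_load(TLBLI_XO, 3)))   # tlbli r3
```

A sketch like this can be handy when an assembler does not recognize the 603e-specific mnemonics and the instruction has to be emitted as a literal word.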
Chapter 3 

Instruction and Data Cache Operation 

The PowerPC 603e microprocessor provides two 16-Kbyte, four-way set associative 
caches to allow the registers and execution units rapid access to instructions and data. Both 
the instruction and data caches are tightly coupled to the 603e’s bus interface unit (BIU) to 
allow efficient access to the system memory controller and other bus masters. The 603e’s 
load/store unit (LSU) is also directly coupled to the data cache to allow the efficient 
movement of data to and from the general-purpose and floating-point registers. 

Both the instruction and data caches have a block size of 32 bytes, and the data cache blocks 
can be snooped, or cast-out when the cache block is reloaded. The data cache is designed 
to adhere to a write-back policy, but the 603e allows control of cacheability, write-back 
policy, and memory coherency at the page and block level. Both caches use a least recently 
used (LRU) replacement policy. Burst fill operations to the caches result from cache misses, 
or in the case of the data cache, cache block write-back operations to memory. Note that in 
the PowerPC architecture, the term cache block, or simply block when used in the context 
of cache implementations, refers to the unit of memory at which coherency is maintained. 
For the 603e this is the eight-word cache line. This value may be different for other 
PowerPC implementations. 

The data cache is configured as 128 sets of four blocks. Each block consists of 32 bytes, 
two state bits, and an address tag. The two state bits implement the three-state MEI 
(modified-exclusive-invalid) protocol, a coherent subset of the standard four-state MESI 
protocol. The 603e’s on-chip data cache tags are single-ported, and load or store operations 
must be arbitrated with snoop accesses to the data cache tags. Load or store operations can 
be performed to the cache on the clock cycle immediately following a snoop access if the 
snoop misses; snoop hits may block the data cache for two or more cycles, depending on 
whether a copyback to main memory is required. 
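
The cache geometry quoted above follows directly from the cache size, associativity, and block size. The arithmetic can be checked with a small sketch. The function name is hypothetical.

```python
def cache_sets(cache_bytes, ways, block_bytes):
    """Number of sets in a set-associative cache."""
    return cache_bytes // (ways * block_bytes)


CACHE_BYTES = 16 * 1024   # 16-Kbyte cache
WAYS = 4                  # four-way set associative
BLOCK_BYTES = 32          # eight 4-byte words per block

print(cache_sets(CACHE_BYTES, WAYS, BLOCK_BYTES))   # sets per cache
print(BLOCK_BYTES // 4)                             # words per block
print(BLOCK_BYTES * 8 // 64)                        # 64-bit beats per burst fill
```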

The instruction cache also consists of 128 sets of four blocks, and each block consists of 32 
bytes, an address tag, and a valid bit. The instruction cache is only written as a result of a 
block fill operation on a cache miss. The instruction cache is not snooped, and cache 
coherency must be maintained by software. A fast hardware invalidation capability is 
provided to support cache maintenance. 

The load/store unit provides the data transfer interface between the data cache and the 
GPRs and the FPRs. The load/store unit provides all logic required to calculate effective 
addresses, handle data alignment to and from the data cache, and sequence 
load and store string and multiple operations. As shown in Figure 1-1, the caches provide 
a 64-bit interface to the instruction fetcher and load/store unit. Write operations to the data 
cache can be performed on a byte, half-word, word, or double-word basis. 

The 603e’s bus interface unit receives requests for bus operations from the instruction and 
data caches, and executes the operations per the 603e bus protocol. The BIU provides 
address queues, and prioritization and bus control logic. The BIU also captures snoop 
addresses for data cache, address queue, and memory reservation (lwarx and stwcx. 
instructions) operations. The BIU also contains a touch load address buffer used for address 
compares during load or store operations. All the data for the corresponding address queues 
(load and store data queues) is located in the data cache. The data queues are considered 
temporary storage for the cache and not part of the BIU. 

On a cache miss, the 603e’s cache blocks are filled in four beats of 64 bits each. The burst 
fill is performed as a “critical-double-word-first” operation; the critical double word is 
simultaneously written to the cache and forwarded to the requesting unit, thus minimizing 
stalls due to cache fill latency. Note that the cache being filled cannot be accessed internally 
until the fill completes. 

When address translation is enabled, the memory access is performed under the control of 
the page table entry used to translate the effective address. Each page table entry contains 
four mode control bits, W, I, M, and G, that specify the storage mode for all accesses 
translated using that particular page table entry. The W (write-through) and I (caching- 
inhibited) bits control how the processor executing the access uses its own cache. The M 
(memory coherence) bit specifies whether the processor executing the access must use the 
MEI (modified, exclusive, or invalid) cache coherence protocol to ensure all copies of the 
addressed memory location are kept consistent. The G (guarded memory) bit controls 
whether out-of-order data and instruction fetching is permitted. 
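
As a small illustration of the four mode control bits, the sketch below decodes a 4-bit WIMG value into named attributes. Packing the bits into one integer with W as the most-significant bit is an assumption made for this example, and the class and function names are hypothetical.

```python
from enum import IntFlag


class WIMG(IntFlag):
    W = 0b1000   # write-through
    I = 0b0100   # caching-inhibited
    M = 0b0010   # memory coherence required
    G = 0b0001   # guarded memory


def storage_mode(bits):
    """Describe the storage mode selected by a 4-bit WIMG value."""
    mode = WIMG(bits)
    return {
        "write_through": WIMG.W in mode,
        "caching_inhibited": WIMG.I in mode,
        "coherency_enforced": WIMG.M in mode,
        "guarded": WIMG.G in mode,
    }


print(storage_mode(0b0101))   # caching-inhibited, guarded page
```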

The 603e maintains data cache coherency in hardware by coordinating activity between the 
data cache, the memory system, and the bus interface logic. As bus operations are 
performed on the bus by other bus masters, the 603e bus snooping logic monitors the 
addresses that are referenced. These addresses are compared with the addresses resident in 
the data cache. If there is a snoop hit, the 603e’s bus snooping logic responds to the bus 
interface with the appropriate snoop status (for example, an ARTRY). Additional snoop 
action may be forwarded to the cache as a result of a snoop hit in some cases (a cache push 
of modified data, or a cache block invalidation). 

The 603e supports a fully-coherent 4-Gbyte physical memory address space. Bus snooping 
is used to drive the MEI three-state cache-coherency protocol that ensures the coherency of 
global memory with respect to the processor’s cache. The MEI protocol is described in 
Section 3.6.1, “MEI State Definitions.” 

This chapter describes the organization of the 603e’s on-chip instruction and data caches, 
the MEI cache coherency protocol, cache control instructions, various cache operations, 
and the interaction between the cache, load/store unit, and the bus interface unit. 

3.1 Instruction Cache Organization and Control 

The instruction fetcher accesses the instruction cache frequently in order to sustain the high 
throughput provided by the six-entry instruction dispatch queue. 

3.1.1 Instruction Cache Organization 

The organization of the instruction cache is shown in Figure 3-1. Each cache block contains 
eight contiguous words from memory that are loaded from an eight-word boundary (that is, 
bits A27-A31 of the logical (effective) addresses are zero); as a result, cache blocks are 
aligned with page boundaries. 

Note that address bits A20-A26 provide an index to select a set. Bits A27-A31 select a byte 
within a block. The tags consist of bits PA0-PA19. Address translation occurs in parallel, 
such that higher-order bits (the tag bits in the cache) are physical. Note that the replacement 
algorithm is strictly an LRU algorithm; that is, the least recently used block is filled with 
new instructions on a cache miss. 
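
The way an address is divided into tag, set index, and byte offset can be sketched as follows. The function name is hypothetical, and for simplicity a single 32-bit value stands in for both the effective address (which supplies the index and offset) and the physical address (which supplies the tag).

```python
def split_cache_address(addr):
    """Split a 32-bit address into (tag, set_index, byte_offset); bit 0 is the MSB."""
    byte_offset = addr & 0x1F          # bits A27-A31: byte within the 32-byte block
    set_index = (addr >> 5) & 0x7F     # bits A20-A26: one of 128 sets
    tag = (addr >> 12) & 0xFFFFF       # bits PA0-PA19: physical tag
    return tag, set_index, byte_offset


tag, set_index, byte_offset = split_cache_address(0x12345678)
print(hex(tag), set_index, byte_offset)
```

Because the index and offset together cover only the low-order 12 bits, they lie entirely within a 4-Kbyte page offset, which is what allows the set lookup to proceed in parallel with address translation.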



(Figure not reproduced; it shows Block 0 through Block 3 of each set.) 

Figure 3-1. Instruction Cache Organization 

3.1.2 Instruction Cache Fill Operations 

The 603e’s instruction cache blocks are loaded in four beats of 64 bits each, with the critical 
double word loaded first. The instruction cache is blocked to internal accesses after a cache 
miss until the fill from memory completes. On a cache miss, the critical and following 
double words read from memory are simultaneously written to the instruction cache and 
forwarded to the dispatch queue, thus minimizing stalls due to cache fill latency. There is 
no snooping of the instruction cache. 

3.1.3 Instruction Cache Control 

In addition to instruction cache control instructions, the 603e provides three control bits in 
the HID0 register for the control of invalidating, disabling, and locking the instruction 
cache. In addition, the WIMG bits in the page tables also affect the cacheability of pages 
and whether or not the pages are considered guarded. 

3.1.3.1 Instruction Cache Invalidation 

While the 603e’s instruction cache is automatically invalidated during a power-on or hard 
reset, assertion of the soft reset signal does not cause instruction cache invalidation. 
Software may invalidate the contents of the instruction cache using the instruction cache 
flash invalidate (ICFI) control bit in the HID0 register. Flash invalidation of the instruction 
cache is accomplished by setting and clearing the ICFI bit with two consecutive move to 
SPR operations to the HID0 register. 
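
The set-then-clear sequence can be sketched by computing the two values that the consecutive move to SPR operations would write. The position of ICFI (taken here as HID0 bit 20) is an assumption not stated in this section and should be checked against the HID0 register description. The function names are hypothetical.

```python
ICFI_BIT = 20   # assumed HID0 bit number for ICFI; verify against the HID0 description


def hid0_mask(bit):
    """Mask for a HID0 bit in PowerPC numbering, where bit 0 is the MSB."""
    return 1 << (31 - bit)


def flash_invalidate_writes(current_hid0, bit=ICFI_BIT):
    """Return the two values to write with consecutive mtspr operations."""
    mask = hid0_mask(bit)
    set_value = current_hid0 | mask                    # first write: bit set
    clear_value = current_hid0 & ~mask & 0xFFFFFFFF    # second write: bit cleared
    return [set_value, clear_value]


print([hex(value) for value in flash_invalidate_writes(0x00008000)])
```

All other HID0 bits are carried through unchanged in both writes, so the sequence does not disturb the enable or lock settings.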

3.1.3.2 Instruction Cache Disabling 

The instruction cache may be disabled through the use of the instruction cache enable (ICE) 
control bit in the HID0 register. When the instruction cache is in the disabled state, the 
cache tag state bits are ignored, and all accesses are propagated to the bus as single-beat 
transactions. The ICE bit is cleared during a power-on reset, causing the instruction cache 
to be disabled. The setting of the ICE bit must be preceded by an isync instruction to 
prevent the cache from being enabled or disabled while an instruction access is in progress. 

3.1.3.3 Instruction Cache Locking 

The contents of the instruction cache may be locked through the use of the ILOCK control bit 
in the HID0 register. A locked instruction cache supplies instructions normally on a cache 
hit, but cache misses are treated as cache-inhibited accesses. The cache inhibited (CI) signal 
is asserted if a cache access misses into a locked cache. The setting of the ILOCK bit in 
HID0 must be preceded by an isync instruction to prevent the instruction cache from being 
locked during an instruction access. 

3.2 Data Cache Organization and Control 

The data cache supplies data to the GPRs and FPRs by means of the load/store unit, and 
provides buffers for load and store bus operations. The data cache also provides storage for 
the cache tags required for memory coherency and performs the cache block replacement 
LRU function. 

3.2.1 Data Cache Organization 

The organization of the data cache is shown in Figure 3-2. Each cache block contains eight 
contiguous words from memory that are loaded from an eight-word boundary (that is, bits 
A27-A31 of the logical (effective) addresses are zero); as a result, cache blocks are aligned 
with page boundaries. 

Note that address bits A20-A26 provide an index to select a set. Bits A27-A31 select a byte 
within a block. The tags consist of bits PA0-PA19. Address translation occurs in parallel, 
such that higher-order bits (the tag bits in the cache) are physical. Note that the replacement 
algorithm is strictly an LRU algorithm; that is, the least recently used block is filled with 
new data on a cache miss. 



(Figure not reproduced; it shows Block 0 through Block 3 of each set.) 

Figure 3-2. Data Cache Organization 

3.2.2 Data Cache Fill Operations 

The 603e’s data cache blocks are filled in four beats of 64 bits each, with the critical double 
word loaded first. The data cache is blocked to internal accesses after a cache miss until the 
load from memory completes. On a cache miss, the critical double word read from memory 
is simultaneously written to the data cache and forwarded to the requesting unit, thus 
minimizing stalls due to cache fill latency. 

3.2.3 Data Cache Control 

The 603e provides several means of data cache control through the use of the WIMG bits 
in the page tables, control bits in the HID0 register, and user- and supervisor-level cache 
control instructions. While memory page level cache control is provided by the WIMG bits, 
the on-chip data cache can be invalidated, disabled, or locked by the three control bits in 
the HID0 register described in this section. (Note that user- and supervisor-level are 
referred to as problem and privileged state, respectively, in the architecture specification.) 

3.2.3.1 Data Cache Invalidation 

While the data cache is automatically invalidated when the 603e is powered up and during 
a hard reset, assertion of the soft reset signal does not cause data cache invalidation. 
Software may invalidate the contents of the data cache using the data cache flash invalidate 
(DCFI) control bit in the HID0 register. Flash invalidation of the data cache is 
accomplished by setting and clearing the DCFI bit in two consecutive store operations. 

3.2.3.2 Data Cache Disabling 

The data cache may be disabled through the use of the data cache enable (DCE) control bit 
in the HID0 register. When the data cache is in the disabled state, the cache tag state bits 
are ignored, and all accesses are propagated to the bus as single-beat transactions. The DCE 
bit is cleared on power-up, causing the data cache to be disabled. The setting of the DCE 
bit must be preceded by a sync instruction to prevent the cache from being enabled or 
disabled in the middle of a data access. 

Note that while snooping is not performed when the data cache is disabled, cache 
operations (caused by the dcbz, dcbf, dcbst, and dcbi instructions) are not affected by 
disabling the cache, causing potential coherency errors. An example of this would be a dcbf 
instruction that hits a modified cache block in the disabled cache, causing a copyback to 
memory of potentially stale data. 

Regardless of the state of HID0[DCE], load and store operations are assumed to be weakly 
ordered. Thus the LSU can perform load operations that occur later in the program ahead 
of store operations, even when the data cache is disabled. However, strongly ordered load 
and store operations can be enforced through the setting of the I bit (of the page WIMG bits) 
when address translation is enabled. Note that when address translation is disabled, the 
default WIMG bits cause the I bit to be cleared (accesses are assumed to be cacheable), and 
thus the accesses are weakly ordered. Refer to Section 3.5.2, “Caching-Inhibited Attribute 
(I),” for a description of the operation of the I bit and Section 5.2, “Real Addressing Mode,” 
for a description of the WIMG bits when address translation is disabled. 

3.2.3.3 Data Cache Locking 

The contents of the data cache may be locked through the use of the DLOCK control bit in 
the HID0 register. A locked data cache supplies data normally on a cache hit, but cache 
misses are treated as cache-inhibited accesses. The cache inhibited (CI) signal is asserted if 
a cache access misses into a locked cache. The setting of the DLOCK bit in HID0 must be 
preceded by a sync instruction to prevent the data cache from being locked during a data 
access. 

3.2.4 Data Cache Touch Load Support 

Touch load operations allow an instruction stream to prefetch data from memory prior to a 
cache miss. The 603e supports touch load operations through a temporary cache block 
buffer located between the BIU and the data cache. The cache block buffer is essentially a 
floating cache block that is loaded by the BIU on a touch load operation, and is then read 
by a load instruction that requests that data. After a touch load completes on the bus, the 
BIU continues to compare the touch load address with subsequent load requests from the 
data cache. If the load address matches the touch load address in the BIU, the data is 
forwarded to the data cache from the touch load buffer, the read from memory is canceled, 
and the touch load address buffer is invalidated. 

To avoid the storage of stale data in the touch load buffer, touch load requests that are 
mapped as write-through or caching-inhibited by the MMU are treated as no-ops by the 
BIU. Also, subsequent load instructions after a touch load that are mapped as write-through 
or caching-inhibited do not hit in the touch load buffer, and cause the touch load buffer to 
be invalidated on a matching address. 

While the 603e provides only a single cache block buffer, other PowerPC microprocessor 
implementations may provide buffering for more than one cache block. Programs written 
for other implementations may issue several dcbt or dcbtst instructions sequentially, 
reducing the performance if executed on the 603e. To improve performance in these 
situations, the NOOPTI bit (bit 31) in the HID0 register may be set. This causes the dcbt 
and dcbtst instructions to be treated as no-ops, cause no bus activity, and incur only one 
processor clock cycle of execution latency. The default state of the NOOPTI bit is cleared 
after a power-on reset operation, enabling the use of the dcbt and dcbtst instructions. 
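
The touch load rules described in this section can be summarized in a toy model. This is a simplified sketch, not a description of the hardware: the class and method names are hypothetical, timing is ignored, and the model assumes that a new touch load simply replaces whatever the single buffer held.

```python
BLOCK_MASK = ~0x1F & 0xFFFFFFFF   # 32-byte cache block alignment


class TouchLoadBuffer:
    """Single-entry model of the touch load buffer between the BIU and data cache."""

    def __init__(self, noopti=False):
        self.noopti = noopti       # models HID0[NOOPTI]
        self.block_addr = None     # address of the buffered block, if any

    def touch(self, addr, write_through=False, caching_inhibited=False):
        """Model dcbt/dcbtst; return True if the buffer was loaded."""
        if self.noopti or write_through or caching_inhibited:
            return False           # treated as a no-op
        self.block_addr = addr & BLOCK_MASK   # assumption: replaces any older entry
        return True

    def load(self, addr, write_through=False, caching_inhibited=False):
        """Model a later load; return where the data comes from."""
        if self.block_addr != (addr & BLOCK_MASK):
            return "memory"
        self.block_addr = None     # a matching address always invalidates the buffer
        if write_through or caching_inhibited:
            return "memory"        # such loads do not hit in the buffer
        return "buffer"


buf = TouchLoadBuffer()
buf.touch(0x1000)
print(buf.load(0x1004))   # same block as the touch load
print(buf.load(0x1004))   # buffer was invalidated by the first load
```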

3.3 Basic Data Cache Operations 

This section describes the three types of operations that can occur to the data cache, and 
how these operations are implemented in the 603e. 

3.3.1 Data Cache Fill 

A cache block is filled after a read miss or write miss (read-with-intent-to-modify) occurs 
in the cache. The cache block that corresponds to the missed address is updated by a burst 
transfer of the data from system memory. Note that if a read miss occurs in a system with 
multiple bus masters, and the data is modified in another cache, the modified data is first 
written to external memory before the cache fill occurs. 

3.3.2 Data Cache Cast-Out Operation 

The 603e uses an LRU replacement algorithm to determine which of the four possible cache 
locations in a set should be used for a cache update on a cache miss. Adding a new block to the 
cache causes any modified data associated with the least recently used element to be written 
back, or cast out, to system memory to maintain memory coherence. 
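
The cast-out behavior can be illustrated with a toy model of one four-way set under strict LRU replacement. This is a sketch for illustration only. The class name and interface are hypothetical, and the model tracks only tags and a modified flag.

```python
from collections import OrderedDict


class CacheSet:
    """One set of a write-back cache with strict LRU replacement."""

    def __init__(self, ways=4):
        self.ways = ways
        self.blocks = OrderedDict()   # tag -> modified flag, least recently used first

    def access(self, tag, write=False):
        """Access a block; return the tag cast out to memory, or None."""
        cast_out = None
        if tag in self.blocks:
            modified = self.blocks.pop(tag) or write     # hit: keep the modified state
        else:
            if len(self.blocks) == self.ways:
                victim, was_modified = self.blocks.popitem(last=False)
                if was_modified:
                    cast_out = victim                    # modified victim is written back
            modified = write
        self.blocks[tag] = modified                      # now the most recently used
        return cast_out


cache_set = CacheSet()
cache_set.access(1, write=True)
for tag in (2, 3, 4):
    cache_set.access(tag)
print(cache_set.access(5))   # the set is full and tag 1 is least recently used
```

Only a modified victim produces a cast-out; an unmodified victim is simply replaced.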

3.3.3 Cache Block Push Operation 

When a cache block in the 603e is snooped and hit by another bus master and the data is 
modified, the cache block must be written to memory and made available to the snooping 
device. The cache block that is hit is said to be pushed out onto the bus. The 603e supports 
two kinds of push operations — normal push operations and enveloped high-priority push 
operations, which are described in Section 3.6.9, “Enveloped High-Priority Cache Block 
Push Operation.” 

3.4 Data Cache Transactions on Bus 

The 603e transfers data to and from the data cache in single-beat transactions of two words, 
or in four-beat transactions of eight words which fill a cache block. 

3.4.1 Single-Beat Transactions 

Single-beat bus transactions can transfer from one to eight bytes to or from the 603e. 
Single-beat transactions can be caused by cache write-through accesses, caching-inhibited 
accesses (I bit of the WIMG bits for the page is set), or accesses when the cache is disabled 
(HID0[DCE] bit is cleared), and can be misaligned. 

3.4.2 Burst Transactions 

Burst transactions on the 603e always transfer eight words of data at a time, and are aligned 
to a double-word boundary. The 603e transfer burst (TBST) output signal indicates to the 
system whether the current transaction is a single-beat transaction or four-beat burst 
transfer. Burst transactions have an assumed address order. For cacheable read operations 
or cacheable, non-write-through write operations that miss the cache, the 603e presents the 
double-word aligned address associated with the load or store instruction that initiated the 
transaction. 

As shown in Figure 3-3, this double word contains the address of the load or store that missed 
the cache. This minimizes latency by allowing the critical code or data to be forwarded to 
the processor before the rest of the block is filled. For all other burst operations, however, 
the entire block is transferred in order (oct-word aligned). Critical-double-word-first 
fetching on a cache miss applies to both the data and instruction cache. 

3.4.3 Access to Direct-Store Segments 

The 603e does not provide support for access to direct-store segments. Operations 
attempting to access a direct-store segment will invoke a DSI exception. For additional 
information about DSI exceptions, refer to Section 4.5.3, “DSI Exception (0x00300).” 

603e Cache Address Bits (27...28): 

    Bits 27-28     00   01   10   11 
    Double word    A    B    C    D 

If the address requested is in double word A, the address placed on the bus is that of 
double word A, and the four data beats are ordered in the following manner: 

    Beat           0    1    2    3 
    Double word    A    B    C    D 

If the address requested is in double word C, the address placed on the bus will be that of 
double word C, and the four data beats are ordered in the following manner: 

    Beat           0    1    2    3 
    Double word    C    D    A    B 

Figure 3-3. Double-Word Address Ordering — Critical Double Word First 
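
The ordering in Figure 3-3 can be sketched in a few lines of Python. This is only an illustrative model, not part of the 603e documentation: the function names are hypothetical, and the wrap-around order for a miss in double word B or D is an assumption extrapolated from the two cases (A and C) that the figure shows.

```python
DOUBLE_WORDS = "ABCD"  # labels used in Figure 3-3


def critical_double_word(address):
    """Return the index (0-3) of the double word selected by address bits 27-28.

    PowerPC numbers bits from the most significant end, so bits 27-28 of a
    32-bit address sit just above the 3-bit byte offset (bits 29-31).
    """
    return (address >> 3) & 0x3


def burst_beat_order(address):
    """Return the double-word labels for beats 0..3 of a cache-miss burst.

    Assumption: the burst wraps around the 32-byte block starting at the
    critical double word, which matches the A and C cases in Figure 3-3.
    """
    first = critical_double_word(address)
    return [DOUBLE_WORDS[(first + beat) % 4] for beat in range(4)]


if __name__ == "__main__":
    print(burst_beat_order(0x1000))  # miss in double word A
    print(burst_beat_order(0x1010))  # miss in double word C
```

A miss at an address in double word C therefore yields the beat order C, D, A, B, so the data the load is waiting for arrives in beat 0.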

3.5 Memory Management/Cache Access Mode Bits — 
W, I, M, and G 

Some memory characteristics can be set on either a block or page basis by using the WIMG 
bits in the BAT registers or page table entry (PTE) respectively. The WIMG attributes 
control the following functionality: 

• Write-through (W bit) 

• Caching-inhibited (I bit) 

• Memory coherency (M bit) 

• Guarded memory (G bit) 

These bits allow both uniprocessor and multiprocessor system designs to exploit numerous 
system-level performance optimizations. 

Careless specification and use of these bits may create situations where coherency 
paradoxes are observed by the processor. In particular, this can happen when the state of 
these bits is changed without appropriate precautions being taken (for example, when 
flushing the pages that correspond to the changed bits from the caches of all processors in 
the system is required, or when the address translations of aliased physical addresses 
(referred to as real addresses in the architecture specification) specify different values for 
any of the WIM bits). The 603e considers either of these cases to be a programming error 
which may compromise the coherency of memory. These paradoxes can occur within a 
single processor or across several devices, as described in Section 3.6.4.1, “Coherency in 
Single-Processor Systems.” 

The WIMG attributes are programmed by the operating system for each page and block. 
The W and I attributes control how the processor performing an access uses its own cache. 
The M attribute ensures that coherency is maintained for all copies of the addressed 
memory location. The G attribute prevents out-of-order loading and prefetching from the 
addressed memory location. 

When an access requires coherency, the processor performing the access must inform the 
coherency mechanisms throughout the system that the access requires memory coherency. 
The M attribute determines the kind of access performed on the bus (global or local). 

The WIMG attributes occupy four bits in the BAT registers for block address translation 
and in the PTEs for page address translation. The WIMG bits are programmed as follows: 

• The operating system uses the mtspr instruction to program the WIMG bits in the 
BAT registers for block address translation. The IBAT register pairs do not have a 
G bit and all accesses that use the IBAT register pairs are considered not guarded. 

• The operating system writes the WIMG bits for each page into the PTEs in system 
memory as it sets up the page tables. 

Note that for accesses performed with direct address translation (MSR[IR] = 0 or 
MSR[DR] = 0 for instruction or data access, respectively), the WIMG bits are 
automatically generated as 0b0011 (the data is write-back, caching is enabled, memory 
coherency is enforced, and memory is guarded). 

3.5.1 Write-Through Attribute (W) 

When an access is designated as write-through (W = 1), if the data is in the cache, a store 
operation updates the cached copy of the data. In addition, the update is written to the 
external memory location (as described below). 

While the PowerPC architecture permits multiple store instructions to be combined for 
write-through accesses except when the store instructions are separated by a sync or eieio 
instruction, the 603e does not implement this “combined store” capability. Note that a store 
operation that uses the write-through attribute may cause any part of valid data in the cache 
to be written back to main memory. 

The definition of the external memory location to be written to in addition to the on-chip 
cache depends on the implementation of the memory system but can be illustrated by the 
following examples: 

• RAM: The store is sent to the RAM controller to be written into the target RAM. 

• I/O device: The store is sent to the memory-mapped I/O control hardware to be 
written to the target register or memory location. 

In systems with multilevel caching, the store must be written to at least a depth in the 
memory hierarchy that is seen by all processors and devices. 

Accesses that correspond to W = 0 are considered write-back. For this case, although the 
store operation is performed to the cache, it is only made to external memory when a copy- 
back operation is required. Use of the write-back mode (W = 0) can improve overall 
performance for areas of the memory space that are seldom referenced by other masters in 
the system. 

3.5.2 Caching-Inhibited Attribute (I) 

If I = 1, the memory access is completed by referencing the location in main memory, 
bypassing the on-chip cache. During the access, the addressed location is not loaded into 
the cache nor is the location allocated in the cache. It is considered a programming error if 
a copy of the target location of an access to caching-inhibited memory is resident in the 
cache. Software must ensure that the location has not been previously loaded into the cache, 
or, if it has, that it has been flushed from the cache. 

The PowerPC architecture permits data accesses from more than one instruction to be 
combined for cache-inhibited operations, except when the accesses are separated by a sync 
instruction, or by an eieio instruction when the page or block is also designated as guarded. 
This “combined access” capability is not implemented on the 603e. Note that the eieio is 
treated as a no-op by the 603e. 

The caching-inhibited (I) bit in the 603e controls whether load and store operations are 
strongly or weakly ordered. If an I/O device requires load and store accesses to occur in 
program order, then the I bit for the page must be set. 

3.5.3 Memory Coherency Attribute (M) 

This attribute is provided to allow improved performance in systems where hardware- 
enforced coherency is relatively slow, and software is able to enforce the required 
coherency. When M = 0, the processor does not enforce data coherency. When M = 1, the 
processor enforces data coherency and the corresponding access is considered to be a 
global access. 

When the M attribute is set and the access is performed, the global (GBL) signal is asserted 
to indicate that the access is global. A snooping device that holds modified data for the 
accessed location must then respond to this global access by asserting ARTRY and updating 
the memory location. 

Because instruction memory does not have to be consistent with data memory, the 603e 
ignores the M attribute for instruction accesses. 

3.5.4 Guarded Attribute (G) 

When the guarded bit is set, the memory area (block or page) is designated as guarded, 
meaning that the processor performs out-of-order accesses to this area of memory only 
as follows: 

• Out-of-order load operations from guarded memory areas are performed only if the 
corresponding data is resident in the cache. 

• The processor prefetches from guarded areas, but only when required, and only 
within the memory boundary dictated by the cache block. That is, if an instruction 
is certain to be required for execution by the program, it is fetched and the remaining 
instructions in the block may be prefetched, even if the area is guarded. 

This setting can be used to protect certain memory areas from read accesses made by the 
processor that are not dictated directly by the program. If there are areas of memory that are 
not fully populated (in other words, there are holes in the memory map within this area), 
this setting can protect the system from undesired accesses caused by out-of-order load 
operations or instruction prefetches that could lead to the generation of the machine check 
exception. Also, the guarded bit can be used to prevent out-of-order load operations or 
prefetches from occurring to certain peripheral devices that produce undesired results when 
accessed in this way. 

3.5.5 W, I, and M Bit Combinations 

Table 3-1 summarizes the six combinations of the WIM bits. Note that either a zero or one 
setting for the G bit is allowed for each of these WIM bit combinations. 



Table 3-1. Combinations of W, I, and M Bits 

WIM Setting   Meaning 
-----------   ------- 
000           Data may be cached. 
              Loads or stores whose target hits in the cache use that entry in the cache. 
              Memory coherency is not enforced by hardware. 

001           Data may be cached. 
              Loads or stores whose target hits in the cache use that entry in the cache. 
              Memory coherency is enforced by hardware. 

010           Caching is inhibited. 
              The access is performed to external memory, completely bypassing the cache. 
              Memory coherency is not enforced by hardware. 

011           Caching is inhibited. 
              The access is performed to external memory, completely bypassing the cache. 
              Memory coherency must be enforced by external hardware (processor provides 
              hardware indication that access is global). 

100           Data may be cached. 
              Load operations whose target hits in the cache use that entry in the cache. 
              Stores are written to external memory. The target location of the store may 
              be cached and is updated on a hit. 
              Memory coherency is not enforced by hardware. 

101           Data may be cached. 
              Load operations whose target hits in the cache use that entry in the cache. 
              Stores are written to external memory. The target location of the store may 
              be cached and is updated on a hit. 
              Memory coherency is enforced by hardware. 
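
As a rough illustration of how the WIMG bits combine, the following Python sketch decodes a 4-bit WIMG value and summarizes its meaning using the wording of Table 3-1. The function names and the returned strings are hypothetical; the sketch also assumes that the two settings with both W and I set, which Table 3-1 does not list, should simply be rejected.

```python
def decode_wimg(wimg):
    """Split a 4-bit WIMG value into flags; W is the most significant bit."""
    return {
        "W": bool(wimg & 0b1000),
        "I": bool(wimg & 0b0100),
        "M": bool(wimg & 0b0010),
        "G": bool(wimg & 0b0001),
    }


def describe_wim(wimg):
    """Summarize the W, I, and M bits along the lines of Table 3-1."""
    bits = decode_wimg(wimg)
    if bits["W"] and bits["I"]:
        # Assumption: W = 1 with I = 1 is not among the six listed settings.
        raise ValueError("WIM = 11x is not listed in Table 3-1")
    notes = []
    if bits["I"]:
        notes.append("caching inhibited; access bypasses the cache")
    else:
        notes.append("data may be cached")
        if bits["W"]:
            notes.append("stores are also written to external memory")
    if bits["M"]:
        notes.append("access is signaled as global")
    else:
        notes.append("memory coherency is not enforced by hardware")
    return notes


if __name__ == "__main__":
    # 0b0011 is the value generated for direct address translation.
    print(decode_wimg(0b0011))
    print(describe_wim(0b1010))
```

The G bit is decoded but not interpreted by `describe_wim`, because either G setting is allowed with each WIM combination.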



3.5.5.1 Out-of-Order Execution and Guarded Memory 

Out-of-order execution occurs when the 603e performs operations in advance in case the 
result is needed. Typically, these operations are performed by otherwise idle resources; thus, 
if a result is not required, it is ignored and the out-of-order operation typically incurs no 
time penalty. 

Supervisor-level programs designate memory as guarded on a block or page level. Memory 
is designated as guarded if it may not be “well-behaved” with respect to out-of-order 
operations. 

For example, the memory area that contains a memory-mapped I/O device may be 
designated as guarded if an out-of-order load or instruction fetch performed to such a 
device might cause the device to perform unexpected or incorrect operations. Another 
example of memory that should be designated as guarded is the area that corresponds to the 
device that resides at the highest implemented physical address (as it has no successor and 
out-of-order sequential operations such as instruction prefetching may result in a machine 
check exception). In addition, areas that contain holes in the physical memory space may 
be designated as guarded. 

3.5.5.2 Effects of Out-of-Order Data Accesses 

Most data operations may be performed out-of-order, as long as the machine appears to 
follow a simple sequential model. However, the following out-of-order operations do not 
occur: 



• Out-of-order loading from guarded memory (G = 1) does not occur. However, when 
a load or store operation is required by the program, the entire cache block(s) 
containing the referenced data may be loaded into the cache. 

• Out-of-order store operations that alter the state of the target location do not occur. 

• No errors except machine check exceptions are reported due to the out-of-order 
execution of an instruction until it is known that execution of the instruction is 
required. 

Machine check exceptions resulting solely from out-of-order execution (from nonguarded 
memory) may be reported. When an out-of-order instruction's result is abandoned, only one 
side effect (other than a possible machine check) may occur: the referenced bit (R) in the 
corresponding page table entry (and TLB entry) can be set due to an out-of-order load 
operation. See Chapter 4, “Exceptions,” for more information on the machine check 
exception. 

Thus, an out-of-order load or store instruction will not access guarded memory unless one 
of the following conditions exists: 

• The target memory item is resident in an on-chip cache. In this case, the location 
may be accessed from the cache or main memory. 

• The target memory item is cacheable (I = 0) and it is guaranteed that the load or store 
is in the execution path (assuming there are no intervening exceptions). In this case, 
the entire cache block containing the target may be loaded into the cache. 

• The target memory is cache-inhibited (I = 1), the load or store instruction is in the 
execution path, and it is guaranteed that no prior instructions can cause an exception. 
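
The three conditions above can be read as a single predicate. The sketch below is a simplified model for illustration only; the function and parameter names are hypothetical, and each argument stands for a fact that the processor would determine internally.

```python
def may_access_guarded(resident_in_cache, caching_inhibited,
                       in_execution_path, prior_may_cause_exception):
    """Return True if a load or store may access guarded memory.

    Mirrors the three listed conditions:
      1. the target is already resident in an on-chip cache;
      2. the target is cacheable (I = 0) and the access is guaranteed to be
         in the execution path;
      3. the target is caching-inhibited (I = 1), the access is in the
         execution path, and no prior instruction can cause an exception.
    """
    if resident_in_cache:
        return True
    if not caching_inhibited:
        return in_execution_path
    return in_execution_path and not prior_may_cause_exception


if __name__ == "__main__":
    # A speculative load to a non-resident, cacheable, guarded location.
    print(may_access_guarded(False, False, False, False))
```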

3.5.5.3 Effects of Out-of-Order Instruction Fetches 

To avoid instruction fetch delay, the processor typically fetches instructions ahead of those 
currently being executed. Such instruction prefetching is said to be out-of-order in that 
prefetched instructions may not be executed due to intervening branches or exceptions. 

During instruction prefetching, no errors except machine check exceptions are reported due 
to the out-of-order fetching of an instruction until it is known that execution of the 
instruction is required. 

Machine check exceptions resulting solely from out-of-order execution (from nonguarded 
memory) may be reported. When an out-of-order instruction's result is abandoned, only one 
side effect (other than a possible machine check) may occur — the referenced bit (R) in the 
corresponding page table entry (and TLB entry) can be set due to an out-of-order load 
operation. See Chapter 4, “Exceptions,” for more information on the machine check 
exception. 

Instruction fetching from guarded memory is not permitted. 

3.6 Cache Coherency— MEI Protocol 

The primary objective of a coherent memory system is to provide the same image of 
memory to all devices using the system. Coherency allows synchronization and 
cooperative use of shared resources. Otherwise, multiple copies of a memory location, 
some containing stale values, could exist in a system resulting in errors when the stale 
values are used. Each potential bus master must follow rules for managing the state of its 
cache. 






PowerPC 603e RISC Microprocessor User's Manual 




The 603e cache coherency protocol is a coherent subset of the standard MESI four-state 
cache protocol that omits the shared state. Since data cannot be shared, the 603e signals all 
cache block fills as if they were write misses (read-with-intent-to-modify), which flushes 
the corresponding copies of the data in all caches external to the 603e prior to the 603e’s 
cache block fill operation. Following the cache block load, the 603e is the exclusive owner 
of the data and may write to it without a bus broadcast transaction. 

To maintain this coherency, all global reads observed on the bus by the 603e are snooped 
as if they were writes, causing the 603e to write a modified cache block back to memory 
and invalidate the cache block, or simply invalidate the cache block if it is unmodified. The 
exception to this rule occurs when a snooped transaction is a caching-inhibited read (either 
burst or single-beat, where TT0-TT4 = X1010; see Table 7-1 for clarification), in which 
case the 603e does not invalidate the snooped cache block. If the cache block is modified, 
the block is written back to memory, and the cache block is marked exclusive unmodified. 
If the cache block is marked exclusive unmodified when snooped, no bus action is taken, 
and the cache block remains in the exclusive unmodified state. This treatment of caching- 
inhibited reads decreases the possibility of data thrashing by allowing noncaching devices 
to read data without invalidating the entry from the 603e’s data cache. 

3.6.1 MEI State Definitions 

The 603e’s data cache characterizes each 32-byte block it contains as being in one of three 
MEI states. Addresses presented to the cache are indexed into the cache directory with bits 
A20-A26, and the upper-order 20 bits from the physical address translation (PA0-PA19) 
are compared against the indexed cache directory tags. If none of the indexed tags 
matches, the result is a cache miss. If a tag matches, a cache hit occurred and the directory 
indicates the state of the cache block through two state bits kept with the tag. The three 
possible states for a cache block in the cache are the modified state (M), the exclusive state 
(E), and the invalid state (I). The three MEI states are defined in Table 3-2. 



Table 3-2. MEI State Definitions 

MEI State       Definition 
-------------   ---------- 
Modified (M)    The addressed cache block is valid in the cache and only in the cache. 
                The cache block is modified with respect to system memory; that is, the 
                modified data in the cache block has not been written back to memory. 

Exclusive (E)   The addressed block is in this cache only. The data in this cache block 
                is consistent with system memory. 

Invalid (I)     This state indicates that the addressed cache block is not resident in 
                the cache. 
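
The directory lookup described in this section can be sketched as follows. This is a minimal model under stated assumptions: the helper names and the dictionary-based directory are hypothetical, the index is taken from address bits A20-A26 and the tag from PA0-PA19 as described above, and bits 27-31 are assumed to be the byte offset within the 32-byte block.

```python
OFFSET_BITS = 5   # bits 27-31: byte within a 32-byte block (assumed)
INDEX_BITS = 7    # bits A20-A26: selects one of 128 directory entries
TAG_BITS = 20     # bits PA0-PA19: compared against the stored tags


def cache_index(address):
    """Extract address bits A20-A26 (PowerPC bit 0 is the most significant)."""
    return (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)


def cache_tag(physical_address):
    """Extract the upper 20 physical address bits, PA0-PA19."""
    return (physical_address >> (OFFSET_BITS + INDEX_BITS)) & ((1 << TAG_BITS) - 1)


def lookup(directory, physical_address):
    """Return the MEI state ('M' or 'E') on a hit, or None on a miss.

    `directory` is a hypothetical model: a dict mapping an index to a list
    of (tag, state) pairs, one pair per block held at that index.
    """
    tag = cache_tag(physical_address)
    for stored_tag, state in directory.get(cache_index(physical_address), []):
        if state != "I" and stored_tag == tag:
            return state
    return None


if __name__ == "__main__":
    directory = {1: [(0x12345, "M")]}
    print(lookup(directory, 0x12345020))  # hit
    print(lookup(directory, 0x54321020))  # miss
```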



3.6.2 MEI State Diagram 

The 603e provides dedicated hardware to provide memory coherency by snooping bus 
transactions. The address retry capability of the 603e enforces the MEI protocol, as shown 
in Figure 3-4. Figure 3-4 assumes that the WIM bits for the page or block are set to 001; 
that is, write-back, caching-not-inhibited, and memory coherency enforced. 
Section 3.10, “MEI State Transactions,” provides a detailed list of MEI transitions for 
various operations and WIM bit settings. 




SH = Snoop Hit 
RH = Read Hit 
RM = Read Miss 
WH = Write Hit 
WM = Write Miss 
SH/CRW = Snoop Hit, Cacheable Read/Write 
SH/CIR = Snoop Hit, Cache Inhibited Read 

Figure 3-4. MEI Cache Coherency Protocol — State Diagram (WIM = 001) 
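
Because the state diagram itself is not reproduced here, the snoop-hit transitions described in Section 3.6 can be summarized in a short Python sketch. It is an illustrative model only: the function name is hypothetical, it assumes the snooped transaction is global, and it covers only the SH/CRW and SH/CIR cases named in the legend.

```python
def snoop_hit(state, caching_inhibited_read=False):
    """Return (next_state, push_required) for a qualified snoop.

    state is 'M', 'E', or 'I'. A cacheable read or write (SH/CRW) is
    treated as a write, so the block ends up invalid. A caching-inhibited
    read (SH/CIR) leaves the block in the exclusive state.
    """
    if state == "I":
        return ("I", False)          # no copy held, nothing to do
    push_required = state == "M"     # modified data goes back to memory
    if caching_inhibited_read:
        return ("E", push_required)
    return ("I", push_required)


if __name__ == "__main__":
    for state in ("M", "E", "I"):
        print(state, snoop_hit(state), snoop_hit(state, True))
```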

3.6.3 MEI Hardware Considerations 

While the 603e provides the hardware required to monitor bus traffic for coherency, the 
603e data cache tags are single ported, and a simultaneous load or store and snoop access 
represent a resource conflict. In general, the snoop access has highest priority and is given 
first access to the tags. The load or store access will then occur on the clock following the 
snoop. The snoop is not given priority into the tags when the snoop coincides with a tag 
write (for example, validation after a cache block load). In these situations, the snoop is 
retried and must re-arbitrate before the lookup is possible. 

Occasionally, cache snoops cannot be serviced and must be retried. These retries occur if 
the cache is busy with a burst read or write when the snoop operation takes place. 



Note that it is possible for a snoop to hit a modified cache block that is already in the process 
of being written to the copyback buffer for replacement purposes. If this happens, the 603e 
retries the snoop, and raises the priority of the cast-out operation to allow it to go to the bus 
before the cache block fill. 

The global (GBL) signal, asserted as part of the address attribute field during a bus 
transaction, enables the snooping hardware of the 603e. Address bus masters assert GBL to 
indicate that the current transaction is a global access (that is, an access to memory shared 
by more than one device). If GBL is not asserted for the transaction, that transaction is not 
snooped by the 603e. Note that the GBL signal is not asserted for instruction fetches, and 
that GBL is asserted for all data read or write operations when using direct address 
translation. (Note that direct address translation is referred to as the real addressing mode, 
not the direct-store segment, in the architecture specification.) 

Normally, GBL reflects the M-bit value specified for the memory reference in the 
corresponding translation descriptor(s). Care must be taken to minimize the number of 
pages marked as global, because the retry protocol enforces coherency and can use 
considerable bus bandwidth if much data is shared. Therefore, available bus bandwidth can 
decrease as more traffic is marked global. 

The 603e snoops a transaction if the transfer start (TS) and GBL signals are asserted 
together in the same bus clock (this is a qualified snooping condition). No snoop update to 
the 603e cache occurs if the snooped transaction is not marked global. Also, because cache 
block cast-outs and snoop pushes do not require snooping, the GBL signal is not asserted 
for these operations. 
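
The rules for GBL given in this section can be collected into a small Python sketch. It is a simplified model, not a description of the bus hardware: the function names and the transaction labels are hypothetical, and signal assertion is represented as a plain boolean regardless of the electrical polarity of the pins.

```python
NEVER_GLOBAL = {"instruction_fetch", "cast_out", "snoop_push"}
DATA_ACCESS = {"data_read", "data_write"}


def gbl_asserted(kind, m_bit=False, direct_translation=False):
    """Return True if the 603e would assert GBL for this transaction.

    kind is one of the hypothetical labels above. For data accesses GBL
    normally follows the M bit, and it is always asserted when direct
    address translation is in use.
    """
    if kind in NEVER_GLOBAL:
        return False
    if kind in DATA_ACCESS:
        return direct_translation or m_bit
    raise ValueError("unknown transaction kind: %r" % (kind,))


def qualified_snoop(ts_asserted, gbl_asserted_same_clock):
    """A snoop is qualified when TS and GBL are asserted in the same bus clock."""
    return ts_asserted and gbl_asserted_same_clock


if __name__ == "__main__":
    print(gbl_asserted("data_read", m_bit=True))
    print(gbl_asserted("cast_out", m_bit=True))
```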

When the 603e detects a qualified snoop condition, the address associated with the TS 
signal is compared with the cache tags. Snooping finishes if no hit is detected. If, however, 
the address hits in the cache, the 603e reacts according to the MEI protocol shown in 
Figure 3-4. 

To facilitate external monitoring of the internal cache tags, the cache set entry signals 
(CSE0-CSE1) represent in binary the cache set being replaced on read operations 
(including read-with-intent-to-modify operations). The CSE0-CSE1 signals do not apply 
for write operations to memory, or during non-cacheable or touch load operations. Note that 
these signals are valid only for 603e burst operations. Table 3-3 shows the CSE0-CSE1 
(cache set entry) encodings. 

Table 3-3. CSE0-CSE1 Signal Encoding 

CSE0-CSE1   Cache Set Element 
---------   ----------------- 
00          Set 0 
01          Set 1 
10          Set 2 
11          Set 3 



Chapter 3. Instruction and Data Cache Operation 








3.6.4 Coherency Precautions 

The 603e supports a three-state coherency protocol that supports the modified, exclusive, 
and invalid (MEI) cache states. This protocol is a compatible subset of the MESI four-state 
protocol and operates coherently in systems that contain four-state caches. In addition, the 
603e does not broadcast cache operations caused by cache instructions. They are intended 
for the management of the local cache but not for other caches in the system. 

3.6.4.1 Coherency in Single-Processor Systems 

The following situations concerning coherency can be encountered within a single- 
processor system: 

• Load or store to a caching-inhibited page (WIM = 0bX1X) and a cache hit occurs 

Caching is inhibited for this page (I = 1): Load or store operations to a caching-inhibited 
page that hit in the cache cause boundedly undefined results. 

• Store to a page marked write-through (WIM = 0b10X) and a cache read hit to a 
modified cache block 

This page is marked as write-through (W = 1): The 603e pushes the modified cache 
block to memory and marks the block exclusive (E). 

Note that when WIM bits are changed, it is critical that the cache contents should reflect the 
new WIM bit settings. For example, if a block or page that had allowed caching becomes 
caching-inhibited, software should ensure that the appropriate cache blocks are flushed to 
memory and invalidated. 

3.6.5 Load and Store Coherency Summary 

Table 3-4 provides a summary of memory coherency actions performed by the 603e on load 
operations. Noncacheable cases are not part of this table. 



Table 3-4. Memory Coherency Actions on Load Operations 

Cache State   Bus Operation   ARTRY        Action 
-----------   -------------   ----------   -------------------- 
M             None            Don’t care   Read from cache 
E             None            Don’t care   Read from cache 
I             Read            Negated      Load data and mark E 
I             Read            Asserted     Retry read operation 



Table 3-5 provides an overview of memory coherency actions on store operations. This 
table does not include noncacheable or write-through cases. The read-with-intent-to- 
modify (RWITM) examples involve selecting a replacement class and casting-out modified 
data that may have resided in that replacement class. 

Table 3-5. Memory Coherency Actions on Store Operations 

Cache State   Bus Operation   ARTRY        Action 
-----------   -------------   ----------   ---------------------------- 
M             None            Don't care   Modify cache 
E             None            Don't care   Modify cache, mark M 
I             RWITM           Negated      Load data, modify it, mark M 
I             RWITM           Asserted     Retry the RWITM 
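
The load and store actions in Tables 3-4 and 3-5 can be expressed as two small lookup functions. The sketch below is an illustration only, with hypothetical function names; like the tables, it covers cacheable write-back accesses and ignores the cast-out that a replacement may trigger.

```python
def load_action(state, artry=False):
    """Return (bus_operation, next_state, action) for a load, per Table 3-4."""
    if state in ("M", "E"):
        return (None, state, "read from cache")
    if artry:
        return ("read", "I", "retry read operation")
    return ("read", "E", "load data and mark E")


def store_action(state, artry=False):
    """Return (bus_operation, next_state, action) for a store, per Table 3-5."""
    if state == "M":
        return (None, "M", "modify cache")
    if state == "E":
        return (None, "M", "modify cache, mark M")
    if artry:
        return ("RWITM", "I", "retry the RWITM")
    return ("RWITM", "M", "load data, modify it, mark M")


if __name__ == "__main__":
    for state in ("M", "E", "I"):
        print(state, load_action(state), store_action(state))
```

For hits (M or E), the `artry` argument has no effect, which corresponds to the "don't care" entries in the tables.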



3.6.6 Atomic Memory References 

The Load Word and Reserve Indexed (lwarx) and Store Word Conditional Indexed 
(stwcx.) instructions provide an atomic update function for a single, aligned word of 
memory. While an lwarx instruction will normally be paired with an stwcx. instruction 
with the same effective address, an stwcx. instruction to any address will cancel the 
reservation. For detailed information on these instructions, refer to Chapter 2, “PowerPC 
603e Microprocessor Programming Model,” in this book and Chapter 8, “Instruction Set,” 
in The Programming Environments Manual. 

3.6.7 Cache Reaction to Specific Bus Operations 

There are several bus transaction types defined for the 603e bus. The 603e must snoop these 
transactions and perform the appropriate action to maintain memory coherency as shown 
in Table 3-6. A processor may assert ARTRY for any bus transaction due to internal 
conflicts that prevent the appropriate snooping. The transactions in Table 3-6 correspond to 
the transfer type signals TT0-TT4, which are described in Section 7.2.4.1, “Transfer Type 
(TT0-TT4).” 



Table 3-6. Response to Bus Transactions 

Snooped Transaction        603e Response 
-----------------------    ------------- 
Clean block                No action is taken. 

Flush block                No action is taken. 

Write-with-flush,          Write-with-flush and write-with-flush-atomic operations occur 
Write-with-flush-atomic    after the processor issues a store or stwcx. instruction, 
                           respectively. 
                           • If the addressed block is in the exclusive state, the address 
                             snoop forces the state of the addressed block to invalid. 
                           • If the addressed block is in the modified state, the address 
                             snoop causes ARTRY to be asserted and initiates a push of the 
                             modified block out of the cache and changes the state of the 
                             block to invalid. 
                           • The execution of an stwcx. instruction cancels the reservation 
                             associated with any address. 

Kill block                 The kill block operation is an address-only bus transaction 
                           initiated when a dcbz instruction is executed; when snooped by 
                           the 603e, the addressed cache block is invalidated if in the 
                           E state, or flushed to memory and invalidated if in the M state, 
                           and any associated reservation is canceled. 

Write-with-kill            In a write-with-kill operation, the processor snoops the cache 
                           for a copy of the addressed block. If one is found, an additional 
                           snoop action is initiated internally and the cache block is 
                           forced to the I state, killing modified data that may have been 
                           in the block. Any reservation associated with the block is also 
                           canceled. 

Read,                      The read operation is used by most single-beat and burst read 
Read-atomic                operations on the bus. All burst reads observed on the bus are 
                           snooped as if they were writes, causing the addressed cache 
                           block to be flushed. A read on the bus with the GBL signal 
                           asserted causes the following responses: 
                           • If the addressed block in the cache is invalid, the 603e takes 
                             no action. 
                           • If the addressed block in the cache is in the exclusive state, 
                             the block is invalidated. 
                           • If the addressed block in the cache is in the modified state, 
                             the block is flushed to memory and the block is invalidated. 
                           • If the snooped transaction is a caching-inhibited read, and the 
                             block in the cache is in the exclusive state, the snoop causes 
                             no bus activity and the block remains in the exclusive state. 
                             If the block is in the cache in the modified state, the 603e 
                             initiates a push of the modified block out to memory and marks 
                             the cache block as exclusive. 
                           Read atomic operations appear on the bus in response to lwarx 
                           instructions and generate the same snooping responses as read 
                           operations. 

Read-with-intent-to-       A RWITM operation is issued to acquire exclusive use of a memory 
modify (RWITM),            location for the purpose of modifying it. 
RWITM-atomic               • If the addressed block is invalid, the 603e takes no action. 
                           • If the addressed block in the cache is in the exclusive state, 
                             the 603e initiates an additional snoop action to change the 
                             state of the cache block to invalid. 
                           • If the addressed block in the cache is in the modified state, 
                             the block is flushed to memory and the block is invalidated. 
                           The RWITM atomic operations appear on the bus in response to 
                           stwcx. instructions and are snooped like RWITM instructions. 

sync                       No action is taken. 

TLB invalidate             No action is taken. 



3.6.8 Operations Causing ARTRY Assertion 

The following scenarios cause the 603e to assert the ARTRY signal: 

• Snoop hits to a block in the M state (flush or clean) 

This case is a normal snoop hit and will result in ARTRY being asserted if the 
snooped transaction was a “flush” or “clean” request. If the snooped transaction was 
a “kill” request, ARTRY will not be asserted. 

• Snoop attempt during the last TA of a cache line fill 

In no-DRTRY mode, during the cycle that the last TA is asserted to the 603e on a 
cache line fill, the tag is being written to its new state by the 603e and is not 
accessible. This will result in a collision being signaled by asserting ARTRY. With 
DRTRY enabled, the cache tags are inaccessible to a snoop operation one cycle after 
the last TA. 



• Snoop hit after the first TA of a burst load operation 

After the first TA of a burst load operation, the data tags are committed to being 
written; snoop operations cannot be serviced until the load completes, thereby 
causing the assertion of ARTRY. 

• Snoop hits to line in the cast-out buffer 

The 603e's cast-out buffer is kept coherent with main memory, and snoop operations 
that hit in the cast-out buffer will cause the assertion of ARTRY. 

• Snoop attempt during cycles that dcbz instruction or load or store operation is 
updating the tag 

During the execution of a dcbz instruction or during a load or store operation that 
requires a cache line cast-out, the cache tags will be inaccessible during the first and 
last cycle of the operation. 

• Snoop attempt during the cycle that a dcbf or dcbst instruction is updating the tag 

If the EA of a dcbf or dcbst instruction hits in the cache, the tag will be changed to 
its new state. During that clock, the tag is not accessible and snoop transactions 
during that cycle will cause the assertion of ARTRY. 

3.6.9 Enveloped High-Priority Cache Block Push Operation 

In cases where the 603e has completed the address tenure of a read operation, and then 
detects a snoop hit to a modified cache block by another bus master, the 603e provides a 
high-priority push operation. If the address snooped is the same as the address of the data 
to be returned by the read operation, ARTRY is asserted one or more times until the data 
tenure of the read operation is completed. The cache block push transaction can be 
enveloped within the address and data tenures of a read operation. This feature prevents 
deadlocks in system organizations that support multiple memory-mapped buses. 

More specifically, the 603e internally detects the scenario where a load request is 
outstanding and the processor has pipelined a write operation on top of the load. Normally, 
when the data bus is granted to the 603e, the resulting data bus tenure is used for the load 
operation. The enveloped high-priority cache block push feature defines a bus signal, the 
data bus write only qualifier (DBWO), which when asserted with a qualified data bus grant 
indicates that the resulting data tenure should be used for the store operation instead. This 
signal is described in Section 8.10, “Using Data Bus Write Only.” Note that the enveloped 
copyback operation is an internally pipelined bus operation. 

3.7 Cache Control Instructions 

Software must use the appropriate cache management instructions to ensure that caches are 
kept consistent when data is modified by the processor. When a processor alters a memory 
location that may be contained in an instruction cache, software must ensure that updates 
to memory are visible to the instruction fetching mechanism. Although the instructions to 
enforce coherency vary among implementations and hence operating systems should 
provide a system service for this function, the following sequence is typical: 

1. dcbst (update memory) 

2. sync (wait for update) 

3. icbi (invalidate copy in cache) 

4. isync (invalidate copy in own instruction buffer) 

These operations are necessary because the processor does not maintain instruction 
memory coherent with data memory. Software is responsible for enforcing coherency of 
instruction caches and data memory. Since instruction fetching may bypass the data cache, 
changes made to items in the data cache may not be reflected in memory until after the 
instruction fetch completes. 
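
The need for the sequence above can be illustrated with a toy model of a write-back data cache and a separate instruction cache. This Python sketch is not a model of the 603e hardware: the class `SplitCacheModel`, its one-word blocks, and its method names are assumptions made for illustration, and the sync and isync steps are omitted because the model has no timing or instruction prefetch.

```python
class SplitCacheModel:
    """Toy split-cache model: one word per block, no timing."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.dcache = {}  # addr -> [value, modified]
        self.icache = {}  # addr -> value

    def store(self, addr, value):
        # Write-back behavior: the store updates the data cache only.
        self.dcache[addr] = [value, True]

    def dcbst(self, addr):
        # Copy a modified block to memory and mark it unmodified.
        entry = self.dcache.get(addr)
        if entry is not None and entry[1]:
            self.memory[addr] = entry[0]
            entry[1] = False

    def icbi(self, addr):
        # Invalidate any copy held in the instruction cache.
        self.icache.pop(addr, None)

    def fetch(self, addr):
        # Instruction fetch bypasses the data cache entirely.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]


model = SplitCacheModel({0x100: "old instruction"})
model.fetch(0x100)                      # instruction cache now holds the old copy
model.store(0x100, "new instruction")   # new code sits only in the data cache
stale = model.fetch(0x100)
model.dcbst(0x100)                      # step 1: update memory
model.icbi(0x100)                       # step 3: invalidate the cached copy
fresh = model.fetch(0x100)
print(stale, "->", fresh)
```

In this model, omitting either dcbst or icbi leaves the fetch returning the old instruction, which is the failure the four-step sequence is meant to prevent.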

The PowerPC architecture defines instructions for controlling both the instruction and data 
caches when they exist. The 603e interprets the cache control instructions (icbi, dcbi, dcbf, 
dcbz, dcbst) as if they pertain only to the 603e’s caches. They are not intended for use in 
managing other caches in the system. 

The dcbz instruction causes an address-only broadcast on the bus if the contents of the 
block are from a page marked global through the WIMG bits. This broadcast is performed 
for coherency reasons; the dcbz instruction is the only cache control instruction that can 
allocate and take new ownership of a line. The other instructions do not broadcast either for 
the purpose of invalidating or flushing other caches in the system or for managing system 
resources. Any bus activity caused by these instructions is the direct result of performing 
the operation in the 603e cache. Note that a data access exception is generated if the 
effective address of a dcbi, dcbst, dcbf, or dcbz instruction cannot be translated due to the 
lack of a TLB entry. Note that exceptions are referred to as interrupts in the architecture 
specification. 

Note that in the PowerPC architecture, the term cache block, or simply block when used in 
the context of cache implementations, refers to the unit of memory at which coherency is 
maintained. For the 603e this is the eight-word cache line. This value may be different for 
other PowerPC implementations. In-depth descriptions of coding these instructions are 
provided in Chapter 3, “Addressing Modes and Instruction Set Summary,” and Chapter 10, 
“Instruction Set,” in The Programming Environments Manual. 

3.7.1 Data Cache Block Touch (dcbt) Instruction 

This instruction provides a method for improving performance through the use of software- 
initiated prefetch hints. The 603e performs the fetch for the cases when the address hits in 
the TLB or the BAT registers, and when it is a permitted load access from the addressed 
page. The operation is treated similarly to a byte load operation with respect to coherency. 

If the address translation does not hit in the TLB or BAT mechanism, or if it does not have 
load access permission, the instruction is treated as a no-op. 

If the cache is locked or disabled, or if the access is to a page that is marked as guarded, the 
dcbt instruction is treated as a no-op. 

If the access is directed to a write-through or caching-inhibited page, the instruction is 
treated as a no-op. 
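
The no-op conditions listed above can be collected into a single predicate. The sketch below is a reading aid, not a description of the 603e's internal logic; the function name `dcbt_is_noop` and its parameter names are assumptions introduced for this example.

```python
def dcbt_is_noop(translation_hit, load_permitted, cache_locked=False,
                 cache_disabled=False, guarded=False, write_through=False,
                 caching_inhibited=False):
    """Return True if the touch is treated as a no-op under the rules above."""
    # The address must hit in the TLB or BAT registers with load permission.
    if not translation_hit or not load_permitted:
        return True
    # A locked or disabled cache, or a guarded page, suppresses the fetch.
    if cache_locked or cache_disabled or guarded:
        return True
    # Write-through and caching-inhibited pages are not prefetched.
    if write_through or caching_inhibited:
        return True
    return False


print(dcbt_is_noop(translation_hit=True, load_permitted=True))
print(dcbt_is_noop(translation_hit=True, load_permitted=True, guarded=True))
```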

The dcbt instruction never affects the referenced or changed bits in the hashed page table. 

A successful dcbt instruction affects the state of the TLB and cache LRU bits as defined 
by the LRU algorithm. 

The touch load buffer will be marked invalid if the contents of the touch buffer have been 
moved to the cache, if any data cache management instruction has been executed, if a dcbz 
instruction is executed that matches the address of the cache block in the touch buffer, or if 
another dcbt instruction is executed. 

3.7.2 Data Cache Block Touch for Store (dcbtst) Instruction 

The dcbtst instruction, like the data cache block touch instruction (dcbt), allows software 
to prefetch a cache block in anticipation of a store operation (read with intent to modify). 

3.7.3 Data Cache Block Clear to Zero (dcbz) Instruction 

If the block containing the byte addressed by the EA is in the data cache, all bytes are 
cleared. 

If the block containing the byte addressed by the EA is not in the data cache and the 
corresponding page is caching-allowed, the block is established in the data cache without 
fetching the block from main memory, and all bytes of the block are cleared. If the contents 
of the cache block are from a page marked global through the WIM bits, an address-only 
bus transaction is run. 

If the page containing the byte addressed by the EA is caching-inhibited or write-through, 
then the system alignment exception handler is invoked. 

The dcbz instruction is treated as a store to the addressed byte with respect to address 
translation and protection. 
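
The behavior described in this section can be sketched as a small function. This is a simplified Python model under stated assumptions: the cache is a dictionary keyed by block address, the names `dcbz` and `AlignmentException` are invented for the example, and the page attributes are checked before the cache is touched (the text does not state an ordering). The 32-byte block size follows from the eight-word cache line described in Section 3.7.

```python
BLOCK_BYTES = 32  # eight-word cache line


class AlignmentException(Exception):
    """Stands in for invoking the system alignment exception handler."""


def dcbz(cache, ea, caching_inhibited=False, write_through=False,
         global_page=False):
    """Clear the block containing `ea`; return True if an address-only
    bus transaction would be run (page marked global)."""
    if caching_inhibited or write_through:
        raise AlignmentException("dcbz to caching-inhibited or write-through page")
    block = ea & ~(BLOCK_BYTES - 1)
    # Whether or not the block was already cached, it ends up established
    # in the cache with all bytes cleared and without a fetch from memory.
    cache[block] = bytearray(BLOCK_BYTES)
    return global_page


cache = {}
broadcast = dcbz(cache, 0x1234, global_page=True)
print(hex(min(cache)), broadcast)
```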

3.7.4 Data Cache Block Store (dcbst) Instruction 

If the block containing the byte addressed by the EA is in coherence required mode, and a 
block containing the byte addressed by the EA is in the data cache of any processor and has 
been modified, the writing of it to main memory is initiated. 

The function of this instruction is independent of the write-through and caching- 
inhibited/caching-allowed modes of the block containing the byte addressed by the EA. 

This instruction is treated as a load to the addressed byte with respect to address translation 
and protection. 

3.7.5 Data Cache Block Flush (dcbf) Instruction 

The action taken depends on the memory mode associated with the target, and on the state 
of the cache block. The list below describes the action taken for the various cases. The 
actions described are executed regardless of whether the page containing the addressed 
byte is in caching-inhibited or caching-allowed mode. The following actions occur in both 
coherency-required mode (WIM = 0bXX1) and coherency-not-required mode (WIM = 
0bXX0). 

The dcbf instruction causes the following cache activity: 

• Unmodified block— Invalidates the block in the processor’s cache. 

• Modified block — Copies the block to memory and invalidates data cache block. 

• Absent block — Does nothing. 

The 603e treats this instruction as a load to the addressed byte with respect to address 
translation and protection. 

3.7.6 Enforce In-Order Execution of I/O Instruction (eieio) 

As defined by the PowerPC architecture, the eieio instruction provides an ordering function 
for the effects of load and store instructions executed by a given processor. Executing eieio 
ensures that all memory accesses previously initiated by the given processor are completed 
with respect to main memory before any memory accesses subsequently initiated by the 
processor access main memory. The eieio instruction orders loads and stores to caching- 
inhibited memory only. 

The eieio instruction is intended for use only in performing memory-mapped I/O 
operations. It enforces “strong” ordering of cache-inhibited memory accesses during I/O 
operations between the processor and I/O devices. 

When executed by the 603e, the eieio instruction is treated as a no-op; caching-inhibited 
load and store operations (inhibited by the WIMG bits for the page) are performed in strict 
program order. 

3.7.7 Instruction Cache Block Invalidate (icbi) Instruction 

The execution of an icbi instruction causes all four cache sets indexed by the EA to be 
marked invalid. No cache hit is required, and no MMU translation is performed. 
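
Because icbi needs no cache hit and no translation, only the set index derived from the EA matters. The following Python sketch shows one way that index could be computed. The cache geometry (16 Kbytes, four-way set-associative, eight-word lines) is taken from this manual; the assumption that the address bits immediately above the byte offset select the set, and the names `set_index` and `icbi`, are illustrative only.

```python
CACHE_BYTES = 16 * 1024   # instruction cache size
WAYS = 4                  # four-way set-associative
LINE_BYTES = 32           # eight-word cache line
NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)


def set_index(ea):
    """Index of the set selected by an effective address (assumed mapping)."""
    return (ea // LINE_BYTES) % NUM_SETS


def icbi(valid, ea):
    """Mark all four entries of the indexed set invalid; no tag compare."""
    valid[set_index(ea)] = [False] * WAYS


valid = [[True] * WAYS for _ in range(NUM_SETS)]
icbi(valid, 0x0000_1040)
print(NUM_SETS, set_index(0x0000_1040), valid[set_index(0x0000_1040)])
```

Under this assumed mapping, two addresses that differ by a multiple of 4 Kbytes select the same set, so a single icbi invalidates entries for all of them.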

3.7.8 Instruction Synchronize (isync) Instruction 

The isync instruction waits for all previous instructions to complete and then discards any 
previously fetched instructions, causing subsequent instructions to be fetched (or 
refetched) from memory and to execute in the context established by the previous 
instructions. This instruction has no effect on other processors or on their caches. 

3.8 Bus Operations Caused by Cache Control Instructions 

Table 3-7 provides an overview of the bus operations initiated by cache control 
instructions. The cache control, TLB management, and synchronization instructions 
supported by the 603e may affect or be affected by the operation of the bus. None of the 
instructions will actively broadcast through address-only transactions on the bus (except for 
dcbz), and no broadcasts by other masters are snooped by the 603e (except for kills). The 
operation of the instructions, however, may indirectly cause bus transactions to be 
performed, or their completion may be linked to the bus. The following table summarizes 
how these instructions may operate with respect to the bus. 

Note that Table 3-7 assumes that the WIM bits are set to 001; that is, the cache is 
operating in write-back mode, caching is permitted, and coherency is enforced. 

Table 3-7. Bus Operations Caused by Cache Control Instructions (WIM = 001) 

Operation  Cache State  Next Cache State  Bus Operations              Comment
sync       Don’t care   No change         None                        Waits for memory queues to
                                                                      complete bus activity
icbi       Don’t care   I                 None                        —
dcbi       Don’t care   I                 None                        —
dcbf       I, E         I                 None                        —
dcbf       M            I                 Write with kill             Block is pushed
dcbst      I, E         No change         None
dcbst      M            E                 Write                       Block is pushed
dcbz       I            M                 Write with kill             —
dcbz       E, M         M                 Kill block                  Writes over modified data
dcbt       I            No change         Read                        Fetched cache block is stored
                                                                      in touch load queue
dcbt       E, M         No change         None                        —
dcbtst     I            No change         Read-with-intent-to-modify  Fetched cache block is stored
                                                                      in touch load queue
dcbtst     E, M         No change         None                        —


Table 3-7 does not include noncacheable or write-through cases, nor does it completely 
describe the mechanisms for the operations described. For more information, see 
Section 3.10, “MEI State Transactions.” 
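
For quick reference, the rows of Table 3-7 can be encoded as a lookup structure. The Python sketch below is only a convenience encoding of the table as read here (WIM = 001); the names `TABLE_3_7` and `bus_operation` are assumptions for this example, and for icbi the state refers to the instruction cache entry.

```python
# (instruction, current states, next state or None for "no change", bus operation)
TABLE_3_7 = [
    ("sync",   "IEM", None, "none"),
    ("icbi",   "IEM", "I",  "none"),
    ("dcbi",   "IEM", "I",  "none"),
    ("dcbf",   "IE",  "I",  "none"),
    ("dcbf",   "M",   "I",  "write with kill"),
    ("dcbst",  "IE",  None, "none"),
    ("dcbst",  "M",   "E",  "write"),
    ("dcbz",   "I",   "M",  "write with kill"),
    ("dcbz",   "EM",  "M",  "kill block"),
    ("dcbt",   "I",   None, "read"),
    ("dcbt",   "EM",  None, "none"),
    ("dcbtst", "I",   None, "read-with-intent-to-modify"),
    ("dcbtst", "EM",  None, "none"),
]


def bus_operation(instruction, state):
    """Return (next_state, bus_operation) for a block in MEI state `state`."""
    for name, states, next_state, bus_op in TABLE_3_7:
        if name == instruction and state in states:
            return (state if next_state is None else next_state, bus_op)
    raise KeyError((instruction, state))


print(bus_operation("dcbst", "M"))
```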

For detailed information on the cache control instructions, refer to Chapter 2, “PowerPC 
603e Microprocessor Programming Model,” in this book and Chapter 8, “Instruction Set,” 
in The Programming Environments Manual. The 603e contains snooping logic to monitor 
the bus for these commands and the control logic required to keep the cache and the 
memory queues coherent. For additional details about the specific bus operations 
performed by the 603e, see Chapter 8, “System Interface Operation.” 

3.9 Bus Interface 

The bus interface buffers bus requests from the instruction and data caches, and executes 
the requests per the 603e bus protocol. It includes address register queues, prioritization 
logic, and bus control logic. The bus interface also captures snoop addresses for snooping 
in the cache and in the address register queues, snoops for reservations, and holds the touch 
load address for the cache. All data storage for the address register buffers (load and store 
data buffers) is located in the cache section. The data buffers are considered temporary 
storage for the cache and not part of the bus interface. 

The general functions and features of the bus interface are as follows: 

• Seven address register buffers that include the following: 

— Instruction cache load address buffer 

— Data cache load address buffer 

— Data cache touch load address buffer (associated data block buffer located in 
cache) 

— Data cache castout/store address buffer (associated data line buffer located in 
cache) 

— Data cache snoop copyback address buffer (associated data line buffer located in 
cache) 

— Reservation address buffer for snoop monitoring 

• Pipeline collision detection for data cache buffers 

• Reservation address snooping for lwarx/stwcx. instructions 

• One-level address pipelining 

• Load ahead of store capability 

A conceptual block diagram of the bus interface is shown in Figure 3-5. The address 
register queues in the figure hold transaction requests that the bus interface may issue on 
the bus independently of the other requests. The bus interface may have up to two 
transactions operating on the bus at any given time through the use of address pipelining. 




Figure 3-5. Bus Interface Address Buffers 



For additional information about the 603e bus interface and the bus protocols, refer to 
Chapter 8, “System Interface Operation.” 

3.10 MEI State Transactions 



Table 3-8 shows MEI state transitions for various operations. Bus operations are described 
in Table 3-6. 



Operation                              Cache Operation
Load (T = 0)                           Read
Load (T = 0)                           Read
Load (T = 0)                           Read
Load (T = 0)                           Read
Load (T = 0)                           Read
lwarx                                  Read
Store (T = 0)                          Write
Store (T = 0)                          Write
Store ≠ stwcx. (T = 0)                 Write
Store ≠ stwcx. (T = 0)                 Write
Store ≠ stwcx. (T = 0)                 Write
Store (T = 0) or stwcx. (WIM = 10x)    Write
Store (T = 0) or stwcx. (WIM = 10x)    Write



Table 3-8. MEI State Transitions 



Bus 



WIM 






Current Next 

State State 



Cache Actions 



Bus 

Operation 




No x0x E,M 

No x1x 

No x1x 

No x1x M 




Same 1 Cast out of modified Write-with-kill 
block (as required) 

2 Pass four-beat read Read 
to memory queue 



Same Read data from cache 



Same Pass single-beat read Read 
to memory queue 



CRTRY read 



CRTRY read (push 
sector to write queue) 



Write-with-kill 



Acts like other reads but bus operation uses special encoding 



No 00x 

No 00x E,M 

No 10x 








Same 1 Cast out of modified Write-with-kill 
block (if necessary) 

2 Pass RWITM to 
memory queue 



Write data to cache 



Same Pass single-beat write to memory queue Write-with-flush 



Same 1 Write data to cache 




Write-with-flush 



2 Pass single-beat 
write to memory 
queue 



1 CRTRY write 



2 Push block to write Write-with-kill 
queue 



Same Pass single-beat write Write-with- 



to memory queue 



CRTRY write 

Table 3-8. MEI State Transitions (Continued) 









Cache 

Operation 


Bus 

sync 


WIM 


Write 


No 


x1x 


Conditional 


If the reserved 


write 


uses a special 


Data cache block flush          No    xxx
Data cache block flush          No    xxx
Data cache block store          No    xxx
Data cache block store          No    xxx
Data cache block set to zero    No    x1x
Data cache block set to zero    No    10x
Data cache block set to zero    Yes   00x
Data cache block set to zero    No    00x
Data cache block touch          No    x1x
Data cache block touch          No    x1x
Data cache block touch          No    x1x



Current Next 

State State 




Cache Actions 


Bus 

Operation 


1 CRTRY write 


— 


2 Push block to write 
queue 


Write-with-kill 






Same 1 CRTRY dcbf 
2 Pass flush 



3 State change only 



Push block to write 
queue 



Same 1 CRTRY dcbst 
2 Pass clean 



Same 3 No action 



Push block to write 
queue 



Alignment trap 



Write-with-kill 



Write-with-kill 











Alignment trap 



Same 1 CRTRY dcbz 



2 Cast out of modified Write-with-kill 
block 



3 Pass kill 



4 Clear block 



Clear block 



Same Pass single-beat read Read 
to memory queue 



CRTRY read 



1 CRTRY read 



2 Push block to write Write-with-kill 
queue 

Table 3-8. MEI State Transitions (Continued) 

Operation               Cache Operation         Bus   WIM  Current  Next   Cache Actions                   Bus Operation
                                                sync       State    State
dcbt                    Data cache block touch  No    x0x  I        Same   1 Cast out of modified block    Write-with-kill
                                                                             (as required)
                                                                           2 Pass four-beat read to        Read
                                                                             memory queue
dcbt                    Data cache block touch  No    x0x  E,M      Same   No action                       —
Single-beat read        Reload dump             No    xxx  I        Same   Forward data_in                 —
Four-beat read          Reload dump             No    xxx  I        E      Write data_in to cache
(double-word-aligned)
Four-beat write         Reload dump             No    xxx  I        M      Write data_in to cache
(double-word-aligned)
E→I                     Snoop write or kill     No    xxx  E        I      State change only (committed)   —
M→I                     Snoop kill              No    xxx  M        I      State change only (committed)   —
Push M→I                Snoop flush             No    xxx  M        I      Conditionally push              Write-with-kill
Push M→E                Snoop clean             No    xxx  M        E      Conditionally push              Write-with-kill
tlbie                   TLB invalidate          No    xxx  x        x      1 CRTRY TLBI
                                                                           2 Pass TLBI                     —
                                                                           3 No action                     —
sync                    Synchronization         No    xxx  I        I      1 CRTRY sync                    —
                                                                           2 Pass sync                     —
                                                                           3 No action                     —



Note that single-beat writes are not snooped in the write queue. 

Chapter 4 
Exceptions 

The PowerPC exception mechanism allows the processor to change to supervisor state 
(referred to as privileged state in the architecture specification) as a result of external 
signals, errors, or unusual conditions arising in the execution of instructions. When 
exceptions (referred to as interrupts in the architecture specification) occur, information 
about the state of the processor is saved to certain registers and the processor begins 
execution at an address (exception vector) predetermined for each exception. Processing of 
exceptions occurs in supervisor mode. 

Although multiple exception conditions can map to a single exception vector, a more 
specific condition may be determined by examining a register associated with the 
exception — for example, the DSISR or the FPSCR. Additionally, certain exception 
conditions can be explicitly enabled or disabled by software. 

The PowerPC architecture requires that exceptions be handled in program order; therefore, 
although a particular implementation may recognize exception conditions out of order, they 
are handled strictly in order with respect to the instruction stream. When an instruction- 
caused exception is recognized, any unexecuted instructions that appear earlier in the 
instruction stream, including any that have not yet entered the execute state, are required to 
complete before the exception is taken. An instruction is said to have “completed” when 
the results of that instruction’s execution have been committed to the registers defined by 
the architecture (for example, the GPRs or FPRs, rather than rename buffers). If a single 
instruction encounters multiple exception conditions, those exceptions are taken and 
handled sequentially. Likewise, exceptions that are asynchronous are recognized when they 
occur, but are not handled until the next instruction to complete in program order 
successfully completes. Throughout this chapter, the term “next instruction” implies the 
next instruction to complete in program order. 

Note that exceptions can occur while an exception handler routine is executing, and 
multiple exceptions can become nested. It is up to the exception handler to save the states 
to allow control to ultimately return to the original excepting program. 

In many cases, after an exception handler handles an exception, there is an attempt to 
execute the instruction that caused the exception. Instruction execution continues until the 
next exception condition is encountered. This method of recognizing and handling 
exception conditions sequentially guarantees that the machine state is recoverable and 
processing can resume without losing instruction results. 

Exception handlers should save the information stored in SRR0 and SRR1 soon after the 
exception is taken to prevent this information from being lost due to another exception 
being taken. The information should be saved before enabling any exception that is 
automatically disabled when an exception is taken. 
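
The reason for saving SRR0 and SRR1 early can be shown with a toy model of nested exceptions. The Python sketch below is a deliberate simplification: the class `ToyCpu`, its methods, and the numeric values are assumptions made for illustration, and the model ignores everything about the processor except the two save/restore registers.

```python
class ToyCpu:
    """Minimal model: taking an exception overwrites SRR0 and SRR1."""

    def __init__(self, pc, msr):
        self.pc, self.msr = pc, msr
        self.srr0 = self.srr1 = 0

    def take_exception(self, vector):
        # The resume address and machine state are saved, then control
        # passes to the handler at the exception vector.
        self.srr0, self.srr1 = self.pc, self.msr
        self.pc = vector

    def return_from_exception(self):
        self.pc, self.msr = self.srr0, self.srr1


cpu = ToyCpu(pc=0x2000, msr=0x1)
cpu.take_exception(0x500)             # first exception
saved = (cpu.srr0, cpu.srr1)          # handler saves SRR0/SRR1 immediately
cpu.take_exception(0x300)             # nested exception overwrites SRR0/SRR1
cpu.return_from_exception()           # back in the first handler
cpu.srr0, cpu.srr1 = saved            # restore the saved values
cpu.return_from_exception()           # back in the interrupted program
print(hex(cpu.pc))
```

Without the `saved` copy, the second return would use the values left by the nested exception, and the original resume address would be lost.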

In this chapter, the following terminology is used to describe the various stages of exception 
processing: 

Recognition   Exception recognition occurs when the condition that can cause an 
              exception is identified by the processor. 

Taken         An exception is said to be taken when control of instruction 
              execution is passed to the exception handler; that is, the context is 
              saved, the instruction at the appropriate vector offset is fetched, 
              and the exception handler routine is executed in supervisor mode. 

Handling      Exception handling is performed by the software linked to the 
              appropriate vector offset. Exception handling is performed at 
              supervisor level. 

4.1 Exception Classes 

The PowerPC architecture supports four types of exceptions: 

• Synchronous, precise— These are caused by instructions. All instruction-caused 
exceptions are handled precisely; that is, the machine state at the time the exception 
occurs is known and can be completely restored. This means that (excluding the 
system call exceptions) the address of the faulting instruction is provided to the 
exception handler and that neither the faulting instruction nor subsequent 
instructions in the code stream will complete before the exception is taken. Once the 
exception is processed, execution resumes at the address of the faulting instruction 
(or at an alternate address provided by the exception handler). When an exception is 
taken due to a trap or system call instruction, execution resumes at an address 
provided by the handler. 

• Synchronous, imprecise — The PowerPC architecture defines two imprecise 
floating-point exception modes, recoverable and nonrecoverable. Even though the 
PowerPC 603e microprocessor provides a means to enable the imprecise modes, it 
implements these modes identically to the precise mode (that is, floating-point 
enabled exceptions are always precise on the 603e). 

• Asynchronous, maskable — The external, SMI, and decrementer interrupts are 
maskable asynchronous exceptions. When these exceptions occur, their handling is 
postponed until the next instruction, and any exceptions associated with that 
instruction, completes. If there are no instructions in the execution units, the 
exception is taken immediately upon determination of the correct restart address (for 
loading SRR0). 

• Asynchronous, nonmaskable — There are two nonmaskable asynchronous 
exceptions: system reset and the machine check exception. These exceptions may 
not be recoverable, or may provide a limited degree of recoverability. All exceptions 
report recoverability through the MSR[RI] bit. 

The 603e exception classes are shown in Table 4-1. 

Table 4-1. PowerPC 603e Microprocessor Exception Classifications 

Type                         Exception
Asynchronous, nonmaskable    Machine check
                             System reset
Asynchronous, maskable       Machine check
                             External interrupt
                             Decrementer
                             System management interrupt
Synchronous, precise         Instruction-caused exceptions



Note that Table 4-1 includes no synchronous imprecise exceptions. While the PowerPC 
architecture supports imprecise handling of floating-point exceptions, the 603e implements 
these exception modes as precise exceptions. 

Although the PowerPC architecture specifies that the recognition of the machine check 
exception is nonmaskable, on the 603e the stimuli that cause this exception are maskable. 
For example, the machine check exception is caused by the assertion of TEA, APE, DPE, 
or MCP. However, the MCP, APE, and DPE signals can be disabled by bits 0, 2, and 3 
respectively in HID0. Therefore, the machine check caused by TEA is the only truly 
nonmaskable machine check exception. 
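
The bit positions mentioned above use PowerPC numbering, in which bit 0 is the most significant bit of the register. The following Python sketch shows how the three HID0 bits could be tested. It assumes that a set bit enables the corresponding machine check source; the text above states only that these bits control the signals, so that polarity and the names `ppc_bit` and `machine_check_sources` are assumptions for this example.

```python
def ppc_bit(n, width=32):
    """Mask for bit n in PowerPC numbering (bit 0 is the most significant)."""
    return 1 << (width - 1 - n)


# HID0 bit positions given in the text for each maskable source.
HID0_BITS = {"MCP": 0, "APE": 2, "DPE": 3}


def machine_check_sources(hid0):
    """List the signals that can cause a machine check for a HID0 value."""
    sources = ["TEA"]  # TEA is not controlled by HID0
    for name, bit in HID0_BITS.items():
        if hid0 & ppc_bit(bit):  # assumed: 1 means the source is enabled
            sources.append(name)
    return sources


print(machine_check_sources(0x8000_0000))
```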

The 603e’s exceptions, and conditions that cause them, are listed in Table 4-2. Exceptions 
that are specific to the 603e are indicated. 
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Table 4-2. Exceptions and Conditions 



Exception 


Vector Offset 


Type 


(hex) 


Reserved 


00000 


System reset 


00100 


Machine check 


00200 


DSI 


00300 


ISI 


00400 


External 

interrupt 


00500 


Alignment 


00600 



Causing Conditions 




A system reset is caused by the assertion of either 



A machine check is caused by the asser tion of the TEA signal during a data 
bus transaction, assertion of MCP, AP£, and dPE. 



The cause of a DSI exception can be determined by the bit settings in the 

DSISR, listed as follows: 

I Set if the translation of an attempted access is not found in the primary 
hash table entry group (HTEG), or in the rehashed secondary HTEG, or in 
the range of a DBAT register; othenvise cleared. 

4 Set if a memory access is not permitted by the page or DBAT protection 
mechanism; otherwise cleared. 

5 Set If memory access is attempted to a direct-store segment; otherwise 
cleared. 

6 Set for a store operation and cleared for a load operation. 

II Set if eciwx or ecowx is used and EAR[E] is cleared. 



An ISI exception Is caused when an instruction fetch cannot be performed for 

any of the following reasons: 

• The effective (logical) address cannot be translated. That is, there is a page 
fault for this portion of the translation, so an ISI exception must be taken to 
load the PTE (and possibly the page) into memory. 

• The fetch access is to a direct-store segment. 

• The fetch access is to a no-execute segment. 

• The fetch access is to guarded storage and MSR[IR] = 1 . 

• The fetch access violates memory protection. If the key bits (Ks and Kp) in 
the segment register and the PP bits in the PTE are set to prohibit read 
access, instructions cannot be fetched from this location. 



An external interrupt is caused when MSR[EE] = 1 and the InT signal is 
asserted. 



An alignment exception Is caused when the 603e cannot perform a memory 

access for any of the following reasons: 

• The operand of a floating-point load or store is not word-aligned. 

• The operand of Imw, stmw, Iwarx, or stwcx. is not word-aligned. 

• The operand of dcbz is in a page that is write-through or caching-inhibited 
for a virtual mode access. 

• The operand of an elementary, multiple or string load or store crosses a 
segment boundary with a change to the direct-store T bit. 

• A little-endian access is misaligned, or a multiple access is attempted with 
the little-endian bit set. 
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Table 4-2. Exceptions and Conditions (Continued) 



Exception 

Type 



Vector Offset 
(hex) 




Floating-point 

unavailable 


00800 


Decrementer 


00900 


Reserved 


OOAOO-OOBFF 


System call 


OOCOO 


Trace 


OODOO 


Floating-point 

assist 


OOEOO 


Reserved 


00E1O-00FFF 


Instruction 
translation miss 


01000 


Data load 
translation miss 


01100 


Data store 
translation miss 


01200 



Causing Conditions 



A program exception is caused by one of the following exception conditions, 
which correspond to bit settings in SRR1 and arise during execution of an 
instruction: 

• Floating-point enabled exception — A floating-point enabled exception 
condition is generated when the following condition is met: 

(MSR[FE0l I MSR[FE1]) & FPSCR[FEX] is 1. 

FPSCR[FEX] is set by the execution of a floating-point instruction that 
causes an enabled exception or by the execution of one of the “move to 
FPSCR” instructions that results in both an exception condition bit and its 
corresponding enable bit being set in the FPSCR. 

• Illegal instruction — ^An illegal instruction program exception is generated 
when execution of an instruction is attempted with an illegal opcode or 
illegal combination of opcode and extended opcode fields (including 
PowerPC instructions not Implemented In the 603e. These do not include 
those optional instructions treated as no-ops). 

• Privileged instruction — A privileged instruction type program exception is 
generated when the execution of a privileged instruction is attempted and 
the MSR register user privilege bit, MSR[PR], is set. In the 603e, this 
exception Is generated for mtspr or mfspr with an invalid SPR field if 
SPR[0] = 1 and MSR[PR] = 1. This may not be true for all PowerPC 
processors. 

• Trap — A trap type program exception is generated when any of the 
conditions specified in a trap Instruction is met. 



A floating-point unavailable exception is caused by an attempt to execute a 
floating-point instruction (including floating-point load, store, and move 
Instructions) when the floating-point available bit is disabled, (MSR[FP] = 0). 



The decrementer exception occurs when the most significant bit of the 
decrementer (DEC) register transitions from 0 to 1 . The decrementer 
exception must also be enabled with the MSR[EE] bit. 



A system call exception occurs when a System Call (sc) Instruction is 
executed. 



A trace exception is taken when MSR[SE] =1 or when the currently completing 
Instruction is a branch and MSR[BE] =1 . 



Not implemented in the 603e. 



An instruction translation miss exception is caused when an effective address 
for an instruction fetch cannot be translated by the ITLB. 



A data load translation miss exception is caused when an effective address for 
a data load operation cannot be translated by the DTLB. 



A data store translation miss exception is caused when an effective address 
for a data store operation cannot be translated by the DTLB; or when a DTLB 
hit occurs and the change bit in the PTE must be set due to a data store 
operation. 

Table 4-2. Exceptions and Conditions (Continued)

Exception Type                  Vector Offset (hex)   Causing Conditions

Instruction address breakpoint  01300                 An instruction address breakpoint exception occurs when the
                                                      address (bits 0-29) in the IABR matches the next instruction
                                                      to complete in the completion unit and the IABR enable bit
                                                      (bit 30) is set.

System management interrupt     01400                 A system management interrupt is caused when MSR[EE] = 1
                                                      and the SMI input signal is asserted.

Reserved                        01500-02FFF           —
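
Several of the causing conditions in Table 4-2 are simple predicates on register bits. The following C sketch models four of them (floating-point enabled, decrementer, trace, and instruction address breakpoint). It is illustrative only: the function names are hypothetical, the FPSCR[FEX] bit position (bit 1) is an assumption taken from the PowerPC architecture rather than from this table, and the decrementer check folds the latching of a pending exception into a single test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PowerPC numbers bits from the most significant end: bit 0 is the MSB. */
uint32_t ppc_bit(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << (31u - n);
}

bool test_bit(uint32_t reg, unsigned n)
{
    return (reg & ppc_bit(n)) != 0;
}

/* MSR bit positions as listed in Table 4-6. */
enum { MSR_EE = 16, MSR_FE0 = 20, MSR_SE = 21, MSR_BE = 22, MSR_FE1 = 23 };

/* Assumption: FPSCR[FEX] is bit 1, per the PowerPC architecture. */
enum { FPSCR_FEX = 1 };

/* Floating-point enabled: (MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1. */
bool fp_enabled_exception(uint32_t msr, uint32_t fpscr)
{
    return (test_bit(msr, MSR_FE0) || test_bit(msr, MSR_FE1))
        && test_bit(fpscr, FPSCR_FEX);
}

/* Decrementer: the MSB of DEC goes from 0 to 1 and MSR[EE] is set. */
bool decrementer_exception(uint32_t dec_before, uint32_t dec_after, uint32_t msr)
{
    bool msb_rose = !test_bit(dec_before, 0) && test_bit(dec_after, 0);
    return msb_rose && test_bit(msr, MSR_EE);
}

/* Trace: MSR[SE] = 1, or the completing instruction is a branch and MSR[BE] = 1. */
bool trace_exception(uint32_t msr, bool completing_is_branch)
{
    return test_bit(msr, MSR_SE)
        || (completing_is_branch && test_bit(msr, MSR_BE));
}

/* IABR: bits 0-29 hold the address to match, bit 30 is the enable bit. */
bool iabr_match(uint32_t iabr, uint32_t next_to_complete_addr)
{
    uint32_t addr_mask = ~UINT32_C(3);   /* bits 0-29 */
    return test_bit(iabr, 30)
        && ((iabr & addr_mask) == (next_to_complete_addr & addr_mask));
}
```

For example, a DEC value that steps from 0x00000000 to 0xFFFFFFFF satisfies the decrementer condition only while MSR[EE] is set.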



Exceptions are roughly prioritized by exception class, as follows: 

1. Nonmaskable, asynchronous exceptions have priority over all other exceptions —
system reset and machine check exceptions (although the machine check exception 
condition can be disabled so the condition causes the processor to go directly into 
the checkstop state). These exceptions cannot be delayed, and do not wait for the 
completion of any precise exception handling. 

2. Synchronous, precise exceptions are caused by instructions and are taken in strict 
program order. 

3. Maskable asynchronous exceptions (external interrupt and decrementer exceptions)
are delayed until higher priority exceptions are taken. 

System reset and machine check exceptions may occur at any time and are not delayed even 
if an exception is being handled. As a result, state information for the interrupted exception 
may be lost; therefore, these exceptions are typically nonrecoverable. 

All other exceptions have lower priority than system reset and machine check exceptions,
and may not be taken immediately when they are recognized.

4.1.1 Exception Priorities

The exceptions are listed in Table 4-3 in order of highest to lowest priority. 

Table 4-3. Exception Priorities 

Table 4-3. Exception Priorities (Continued)

Exception Category    Exception        Cause

Instruction           IABR             Instruction address breakpoint exception
dispatch/execution
                      Program          Program exception due to the following:
                                       • Illegal instruction
                                       • Privileged instruction
                                       • Trap

                      System call      System call exception

                      Floating-point   Floating-point unavailable exception
                      unavailable

                      Program          Program exception due to a floating-point enabled exception

                      Alignment        Alignment exception due to the following:
                                       • Floating-point not word-aligned
                                       • lmw, stmw, lwarx, or stwcx. not word-aligned
                                       • Little-endian access is misaligned
                                       • Multiple or string access with little-endian bit set

                      Data access      Data access exception due to a BAT page protection violation

                      Data access      Data access exception due to the following:
                                       • eciwx, ecowx, lwarx, or stwcx. to direct-store segment
                                         (bit 5 of DSISR)
                                       • Crossing from memory segment to direct-store segment
                                         (bit 0 of DSISR)
                                       • Crossing from direct-store segment to memory segment
                                       • Any access to direct-store, SR[T] = 1
                                       • eciwx or ecowx with EAR[E] = 0 (bit 11 of DSISR)

                      DTLB miss        Data TLB miss exception due to:
                                       • Store miss
                                       • Load miss

                      Alignment        Alignment exception due to a dcbz to a write-through or
                                       caching-inhibited page

                      Data access      Data access exception due to TLB page protection violation

                      DTLB miss        Data TLB miss exception due to a change bit not set on a store
                                       operation

Post-instruction      Trace            Trace exception due to the following:
execution                              • MSR[SE] = 1
                                       • MSR[BE] = 1 for branches


Exception priorities are described in detail in “Exception Priorities,” in Chapter 6, 
“Exceptions,” in The Programming Environments Manual.

4.1.2 Summary of Front-End Exception Handling

The following list of interrupt categories describes how the 603e handles exceptions up to 
the point of signaling the appropriate exception to occur. Note that a recoverable state is 
reached if the completed store queue is empty (drained, not canceled) and any instruction 
that is next in program order and has been signaled to complete has completed. If MSR[RI] 
is clear, the 603e is in a nonrecoverable state by default. Also, completion of an instruction 
is defined as performing all architectural register writes associated with that instruction, and 
then removing that instruction from the completion buffer queue. 

• Asynchronous nonmaskable nonrecoverable — (System reset caused either by the
assertion of HRESET or internally during power-on reset (POR)). These exceptions
have highest priority and are taken immediately regardless of other pending 
exceptions or recoverability. A nonpredicted address is guaranteed. 

• Asynchronous maskable nonrecoverable — (Machine check). A machine check 
exception takes priority over any other pending exception except a nonrecoverable
system reset caused either by the assertion of HRESET or internally during POR. A
machine check exception is taken immediately regardless of recoverability. A
machine check exception can occur only if the machine check enable bit,
MSR[ME], is set. If MSR[ME] is cleared, the processor goes directly into checkstop
state when a machine check exception condition occurs. A nonpredicted address is 
guaranteed. 

• Asynchronous nonmaskable recoverable — (System reset caused by the assertion of 
SRESET). This interrupt takes priority over any other pending exceptions except 
nonrecoverable exceptions listed above. This exception is taken immediately when 
a recoverable state is reached. 

• Asynchronous maskable recoverable — (System management interrupt, external 
interrupt, decrementer interrupt). Before handling this type of exception, the next 
instruction in program order must complete or except. If this action causes another 
type of exception, that exception is taken and the asynchronous maskable 
recoverable exception remains pending. Once an instruction can complete without 
causing an exception, further instruction completion is halted while the untaken 
exception remains pending. The exception is taken when a recoverable state is 
reached. 

• Instruction fetch — (ITLB, ISI). When this type of exception is detected, dispatch is
halted and the current instruction stream is allowed to drain. If completing any 
instructions in this stream causes an exception, that exception is taken and the 
instruction fetch exception is forgotten. Otherwise, as soon as the machine is empty 
and a recoverable state is reached, the instruction fetch exception is taken. 

• Instruction dispatch/execution — (Program, DSI, alignment, emulation trap, system
call, DTLB miss on load or store, IABR). This type of exception is determined at
dispatch or execution of an instruction. The exception remains pending until all 
instructions in program order before the exception-causing instruction are 
completed. The exception is then taken without completing the exception-causing 
instruction. If any other exception condition is created in completing these previous 
instructions in the machine, that exception takes priority over the pending 
instruction dispatch/execution exception, which will then be forgotten. 

• Post-instruction execution — (Trace). This type of exception is generated following 
execution and completion of an instruction while a trace mode is enabled. If 
executing the instruction produces conditions for another type of interrupt, that 
exception is taken and the post-instruction execution exception is forgotten for that 
instruction. 

4.2 Exception Processing 

When an exception is taken, the processor uses the save/restore registers, SRR0 and SRR1,
to save the contents of the machine state register for user-level mode (referred to as problem 
mode in the architecture specification) and to identify where instruction execution should 
resume after the exception is handled. 

When an exception occurs, SRR0 is set to point to the instruction at which instruction
processing should resume when the exception handler returns control to the interrupted 
process. All instructions in the program flow preceding this one will have completed and 
no subsequent instruction will have completed. This may be the address of the instruction 
that caused the exception or the next one (as in the case of a system call exception). The 
instruction addressed can be determined from the exception type and status bits. This 
address is used to resume instruction processing in the interrupted process, typically when 
an rfi instruction is executed. The SRR0 register is shown in Figure 4-1.

  SRR0 (holds EA for resuming program execution)
  0                                           31

Figure 4-1. Machine Status Save/Restore Register 0

The save/restore register 1 (SRR1) is used to save machine status (the contents of the MSR)
on exceptions and to restore those values when rfi is executed. SRR1 is shown in
Figure 4-2.

  Exception-specific information and MSR bit values
  0                                              31

Figure 4-2. Machine Status Save/Restore Register 1

Typically, when an exception occurs, bits 0-15 of SRR1 are loaded with exception-specific
information and bits 16-31 of MSR are placed into the corresponding bit positions of
SRR1. The 603e loads SRR1 with specific bits for handling machine check exceptions, as
shown in Table 4-4.



Table 4-4. SRR1 Bit Settings for Machine Check Exceptions

Bits    Name         Description

0       MSR[0]       Copy of MSR bit 0
1-4     —            Reserved
5-9     MSR[5-9]     Copy of MSR bits 5-9
10-11   —            Reserved
12      MCP          Machine check
13      TEA          TEA error
14      DPE          Data parity error
15      APE          Address parity error
16-31   MSR[16-31]   Copy of MSR bits 16-31
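
A machine check handler typically begins by reading SRR1 to find out which condition raised the exception. The sketch below decodes the cause bits of Table 4-4 in C. The type and function names are hypothetical, and the code assumes the SRR1 value has already been read into a 32-bit variable (for example with mfspr in the handler prologue).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit n in PowerPC numbering: bit 0 is the most significant bit. */
uint32_t ppc_bit(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << (31u - n);
}

/* Decoded view of SRR1 after a machine check (hypothetical type). */
typedef struct {
    bool mcp;              /* bit 12: machine check (MCP)      */
    bool tea;              /* bit 13: TEA error                */
    bool data_parity;      /* bit 14: data parity error (DPE)  */
    bool addr_parity;      /* bit 15: address parity error     */
    uint32_t saved_msr_lo; /* bits 16-31: copy of MSR[16-31]   */
} mc_cause;

mc_cause decode_machine_check_srr1(uint32_t srr1)
{
    mc_cause c;
    c.mcp          = (srr1 & ppc_bit(12)) != 0;
    c.tea          = (srr1 & ppc_bit(13)) != 0;
    c.data_parity  = (srr1 & ppc_bit(14)) != 0;
    c.addr_parity  = (srr1 & ppc_bit(15)) != 0;
    c.saved_msr_lo = srr1 & UINT32_C(0x0000FFFF);
    return c;
}
```

With this layout, an SRR1 value of 0x00041032 decodes as a TEA error with the low half of the interrupted MSR equal to 0x1032.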



The 603e loads SRR1 with specific bits for handling the three TLB miss exceptions, as
shown in Table 4-5.

Table 4-5. SRR1 Bit Settings for Software Table Search Operations

Bits    Name         Description

0-3     CRF0         Copy of condition register field 0 (CR0)
4       —            Reserved
5-9     MSR[5-9]     Copy of MSR bits 5-9
10-11   —            Reserved
12      KEY          TLB miss protection key
13      I/D          Instruction/data TLB miss
                     0 DTLB miss
                     1 ITLB miss
14      WAY          Indicates which TLB associativity set should be replaced
                     0 Set 0
                     1 Set 1
15      S/L          Store/load protection instruction
                     0 Load miss
                     1 Store miss
16-31   MSR[16-31]   Copy of MSR bits 16-31
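
A software table search routine can pull everything it needs to classify the miss out of SRR1. The following C sketch extracts the fields of Table 4-5. It is a minimal illustration under the assumption that SRR1 has already been copied into a 32-bit variable; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit n in PowerPC numbering: bit 0 is the most significant bit. */
uint32_t ppc_bit(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << (31u - n);
}

/* Decoded view of SRR1 after a TLB miss exception (hypothetical type). */
typedef struct {
    unsigned cr0;       /* bits 0-3: copy of CR0                   */
    bool key;           /* bit 12: TLB miss protection key         */
    bool is_itlb_miss;  /* bit 13: 0 = DTLB miss, 1 = ITLB miss    */
    unsigned way;       /* bit 14: associativity set to replace    */
    bool is_store;      /* bit 15: 0 = load miss, 1 = store miss   */
} tlb_miss_info;

tlb_miss_info decode_tlb_miss_srr1(uint32_t srr1)
{
    tlb_miss_info info;
    info.cr0          = (unsigned)((srr1 >> 28) & 0xFu);
    info.key          = (srr1 & ppc_bit(12)) != 0;
    info.is_itlb_miss = (srr1 & ppc_bit(13)) != 0;
    info.way          = (srr1 & ppc_bit(14)) ? 1u : 0u;
    info.is_store     = (srr1 & ppc_bit(15)) != 0;
    return info;
}
```

Keeping the decode in one place lets the three miss handlers share a single replacement policy keyed on the WAY field.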

Note that in some implementations, every instruction fetch when MSR[IR] = 1 and every
instruction execution requiring address translation when MSR[DR] = 1 may modify SRR1.

The MSR is shown in Figure 4-3. When an exception occurs, MSR bits, as described in 
Table 4-6, are altered as determined by the exception. 



[Figure: MSR bit layout. Bits 0-12 reserved; bit 13 POW, 14 TGPR, 15 ILE, 16 EE, 17 PR,
18 FP, 19 ME, 20 FE0, 21 SE, 22 BE, 23 FE1, 24 reserved, 25 IP, 26 IR, 27 DR,
28-29 reserved, 30 RI, 31 LE.]

Figure 4-3. Machine State Register (MSR)



Table 4-6 shows the bit definitions for the MSR. Full function reserved bits are saved in
SRR1 when an exception occurs; partial function reserved bits are not saved.

Table 4-6. MSR Bit Settings

Bit(s)  Name   Description

0       —      Reserved. Full function.

1-4     —      Reserved. Partial function.

5-9     —      Reserved. Full function.

10-12   —      Reserved. Partial function.

13      POW    Power management enable (603e-specific)
               0 Disables programmable power modes (normal operation mode).
               1 Enables programmable power modes (nap, doze, or sleep mode).
               This bit controls the programmable power modes only; it has no effect on
               dynamic power management (DPM). MSR[POW] may be altered with an mtmsr
               instruction only. Also, when altering the POW bit, software may alter only
               this bit in the MSR and no others. The mtmsr instruction must be followed
               by a context-synchronizing instruction.
               See Chapter 9, “Power Management,” for more information.

14      TGPR   Temporary GPR remapping (603e-specific)
               0 Normal operation
               1 TGPR mode. GPR0-GPR3 are remapped to TGPR0-TGPR3 for use by TLB miss
                 routines.
               The contents of GPR0-GPR3 remain unchanged while MSR[TGPR] = 1. Attempts
               to use GPR4-GPR31 with MSR[TGPR] = 1 yield undefined results. Temporarily
               replaces GPR0-GPR3 with TGPR0-TGPR3 for use by TLB miss routines. When
               this bit is set, all instruction accesses to GPR0-GPR3 are mapped to
               TGPR0-TGPR3, respectively. The TGPR bit is set when either an instruction
               TLB miss, data read miss, or data write miss exception is taken. The TGPR
               bit is cleared by an rfi instruction.

15      ILE    Exception little-endian mode. When an exception occurs, this bit is copied
               into MSR[LE] to select the endian mode for the context established by the
               exception.

16      EE     External interrupt enable
               0 The processor ignores external interrupts, system management interrupts,
                 and decrementer interrupts.
               1 The processor is enabled to take an external interrupt, system management
                 interrupt, or decrementer interrupt.

17      PR     Privilege level
               0 The processor can execute both user- and supervisor-level instructions.
               1 The processor can only execute user-level instructions.

18      FP     Floating-point available
               0 The processor prevents dispatch of floating-point instructions, including
                 floating-point loads, stores, and moves.
               1 The processor can execute floating-point instructions, and can take
                 floating-point enabled exception type program exceptions.

19      ME     Machine check enable
               0 Machine check exceptions are disabled.
               1 Machine check exceptions are enabled.

20      FE0    Floating-point exception mode 0 (see Table 4-7)

21      SE     Single-step trace enable
               0 The processor executes instructions normally.
               1 The processor generates a trace exception upon the successful completion
                 of the next instruction.

22      BE     Branch trace enable
               0 The processor executes branch instructions normally.
               1 The processor generates a trace exception upon the successful completion
                 of a branch instruction.

23      FE1    Floating-point exception mode 1 (see Table 4-7)

24      —      Reserved. Full function.

25      IP     Exception prefix. The setting of this bit specifies whether an exception
               vector offset is prepended with Fs or 0s. In the following description,
               nnnnn is the offset of the exception. See Table 4-2.
               0 Exceptions are vectored to the physical address 0x000n_nnnn.
               1 Exceptions are vectored to the physical address 0xFFFn_nnnn.

26      IR     Instruction address translation
               0 Instruction address translation is disabled.
               1 Instruction address translation is enabled.
               For more information see Chapter 5, “Memory Management.”

27      DR     Data address translation
               0 Data address translation is disabled.
               1 Data address translation is enabled.
               For more information see Chapter 5, “Memory Management.”

28-29   —      Reserved. Full function.

30      RI     Recoverable exception (for system reset and machine check exceptions)
               0 Exception is not recoverable.
               1 Exception is recoverable.

31      LE     Little-endian mode enable
               0 The processor runs in big-endian mode.
               1 The processor runs in little-endian mode.
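
Because the MSR bits are numbered from the most significant end, converting a bit number from Table 4-6 into a C mask is a common source of off-by-one errors. The sketch below shows one way to do it, along with two small helpers. The enumerator and function names are hypothetical; only the bit numbers come from Table 4-6.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MSR bit numbers from Table 4-6 (bit 0 is the most significant bit). */
enum msr_bit {
    MSR_POW = 13, MSR_TGPR = 14, MSR_ILE = 15, MSR_EE = 16,
    MSR_PR  = 17, MSR_FP   = 18, MSR_ME  = 19, MSR_FE0 = 20,
    MSR_SE  = 21, MSR_BE   = 22, MSR_FE1 = 23, MSR_IP  = 25,
    MSR_IR  = 26, MSR_DR   = 27, MSR_RI  = 30, MSR_LE  = 31
};

/* Convert a PowerPC bit number into a conventional C bit mask. */
uint32_t msr_mask(enum msr_bit b)
{
    assert((unsigned)b < 32u);
    return UINT32_C(1) << (31u - (unsigned)b);
}

bool msr_test(uint32_t msr, enum msr_bit b)
{
    return (msr & msr_mask(b)) != 0;
}

/* MSR[PR] = 0 means both user- and supervisor-level instructions may execute. */
bool msr_is_supervisor(uint32_t msr)
{
    return !msr_test(msr, MSR_PR);
}

/* MSR[IP] selects the exception prefix: 0x000n_nnnn or 0xFFFn_nnnn. */
uint32_t msr_vector_base(uint32_t msr)
{
    return msr_test(msr, MSR_IP) ? UINT32_C(0xFFF00000) : UINT32_C(0x00000000);
}
```

As a cross-check, an MSR value of 0x00000040 has only the IP bit set, so msr_vector_base returns 0xFFF00000 for it.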

The IEEE floating-point exception mode bits (FE0 and FE1) together define whether
floating-point exceptions are handled precisely or imprecisely, or are not taken at
all. The possible settings and default conditions for the 603e are shown in Table 4-7. For
further details, see Chapter 6, “Exceptions,” of The Programming Environments Manual.

Table 4-7. IEEE Floating-Point Exception Mode Bits

FE0   FE1   Mode

0     0     Floating-point exceptions disabled
0     1     Floating-point imprecise nonrecoverable*
1     0     Floating-point imprecise recoverable*
1     1     Floating-point precise mode

* Not implemented in the 603e
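
The mode selection in Table 4-7 can be expressed as a small lookup. The C sketch below returns the architected mode for a given MSR value and, separately, an effective mode for the 603e. The second function rests on an assumption: since the imprecise modes are marked as not implemented and Section 4.2.1 states that enabled floating-point exceptions are taken whenever either bit is set, it treats any nonzero FE0/FE1 setting as precise. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit n in PowerPC numbering: bit 0 is the most significant bit. */
uint32_t ppc_bit(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << (31u - n);
}

typedef enum {
    FP_EXC_DISABLED,
    FP_EXC_IMPRECISE_NONRECOVERABLE,
    FP_EXC_IMPRECISE_RECOVERABLE,
    FP_EXC_PRECISE
} fp_exc_mode;

/* Mode as defined by Table 4-7 (FE0 is MSR bit 20, FE1 is MSR bit 23). */
fp_exc_mode architected_fp_mode(uint32_t msr)
{
    bool fe0 = (msr & ppc_bit(20)) != 0;
    bool fe1 = (msr & ppc_bit(23)) != 0;

    if (!fe0 && !fe1) return FP_EXC_DISABLED;
    if (!fe0 &&  fe1) return FP_EXC_IMPRECISE_NONRECOVERABLE;
    if ( fe0 && !fe1) return FP_EXC_IMPRECISE_RECOVERABLE;
    return FP_EXC_PRECISE;
}

/* Assumption of this sketch: on the 603e, either bit set behaves as precise. */
fp_exc_mode effective_fp_mode_603e(uint32_t msr)
{
    return architected_fp_mode(msr) == FP_EXC_DISABLED
         ? FP_EXC_DISABLED
         : FP_EXC_PRECISE;
}
```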



MSR bits are guaranteed to be written to SRR1 when the first instruction of the exception
handler is encountered. 

4.2.1 Enabling and Disabling Exceptions 

When a condition exists that may cause an exception to be generated, it must be determined 
whether the exception is enabled for that condition. 

• IEEE floating-point enabled exceptions (a type of program exception) are ignored 
when both MSR[FE0] and MSR[FE1] are cleared. If either of these bits is set, all
IEEE enabled floating-point exceptions are taken and cause a program exception.

• Asynchronous, maskable exceptions (that is, the external, system management, and 
decrementer interrupts) are enabled by setting the MSR[EE] bit. When MSR[EE] = 
0, recognition of these exception conditions is delayed. MSR[EE] is cleared 
automatically when an exception is taken, to delay recognition of conditions causing 
those exceptions. 

• A machine check exception can occur only if the machine check enable bit, 
MSR[ME], is set. If MSR[ME] is cleared, the processor goes directly into checkstop 
state when a machine check exception condition occurs. Individual machine check 
exceptions can be enabled and disabled through bits in the HID0 register, which is
described in Table 2-2. 

• System reset exceptions cannot be masked. 
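
The enabling rules in the list above amount to a decision per exception kind. The following C sketch captures them as a single function. It is a simplified model rather than a description of the hardware: the enumerations and the function name are hypothetical, and the per-source machine check enables in HID0 are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Bit n in PowerPC numbering: bit 0 is the most significant bit. */
uint32_t ppc_bit(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << (31u - n);
}

typedef enum {
    EXC_SYSTEM_RESET,
    EXC_MACHINE_CHECK,
    EXC_EXTERNAL,
    EXC_SYSTEM_MANAGEMENT,
    EXC_DECREMENTER,
    EXC_FP_ENABLED
} exc_kind;

typedef enum {
    ACTION_TAKE,       /* exception is enabled and may be taken          */
    ACTION_DELAY,      /* recognition is delayed until MSR[EE] is set    */
    ACTION_IGNORE,     /* condition is ignored                           */
    ACTION_CHECKSTOP   /* processor goes directly into checkstop state   */
} exc_action;

exc_action exception_gate(exc_kind kind, uint32_t msr)
{
    switch (kind) {
    case EXC_SYSTEM_RESET:
        return ACTION_TAKE;                               /* cannot be masked */
    case EXC_MACHINE_CHECK:
        return (msr & ppc_bit(19)) ? ACTION_TAKE          /* MSR[ME] */
                                   : ACTION_CHECKSTOP;
    case EXC_EXTERNAL:
    case EXC_SYSTEM_MANAGEMENT:
    case EXC_DECREMENTER:
        return (msr & ppc_bit(16)) ? ACTION_TAKE          /* MSR[EE] */
                                   : ACTION_DELAY;
    case EXC_FP_ENABLED:
        return (msr & (ppc_bit(20) | ppc_bit(23)))        /* FE0 or FE1 */
             ? ACTION_TAKE : ACTION_IGNORE;
    }
    assert(0 && "unknown exception kind");
    return ACTION_IGNORE;
}
```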

4.2.2 Steps for Exception Processing

After it is determined that the exception can be taken (by confirming that any instruction- 
caused exceptions occurring earlier in the instruction stream have been handled, and by 
confirming that the exception is enabled for the exception condition), the processor does 
the following: 

1. The machine status save/restore register 0 (SRR0) is loaded with an instruction
address that depends on the type of exception. See the individual exception
description for details about how this register is used for specific exceptions.

2. Bits 1-4 and 10-15 of SRR1 are loaded with information specific to the exception
type.

3. Bits 5-9 and 16-31 of SRR1 are loaded with a copy of the corresponding bits of the
MSR.

4. The MSR is set as described in Table 4-6. The new values take effect beginning with 
the fetching of the first instruction of the exception-handler routine located at the 
exception vector address. 

Note that MSR[IR] and MSR[DR] are cleared for all exception types; therefore, 
address translation is disabled for both instruction fetches and data accesses 
beginning with the first instruction of the exception-handler routine. 

5. Instruction fetch and execution resume, using the new MSR value, at a location
specific to the exception type. The location is determined by adding the exception's
vector offset (see Table 4-2) to the base address determined by MSR[IP]. If IP is cleared,
exceptions are vectored to the physical address 0x000n_nnnn. If IP is set, exceptions
are vectored to the physical address 0xFFFn_nnnn. For a machine check exception
that occurs when MSR[ME] = 0 (machine check exceptions are disabled), the 
processor enters the checkstop state (the machine stops executing instructions). See 
Section 4.5.2, “Machine Check Exception (0x00200).” 
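
The five steps above can be sketched as a small software model in C. This is an illustration of the sequence, not a description of the hardware implementation. The cpu_state structure and take_exception function are hypothetical, and the new MSR value follows the common pattern in which ILE, ME, and IP are left unaltered, LE is loaded from ILE, and every other bit is cleared; that pattern is an assumption of this sketch and does not cover rows that differ (for example, TGPR being set on TLB misses or ME being cleared on a machine check).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the registers involved in exception entry. */
typedef struct {
    uint32_t msr;
    uint32_t srr0;
    uint32_t srr1;
    uint32_t pc;     /* address of the next instruction to fetch */
} cpu_state;

#define SRR1_INFO_MASK     UINT32_C(0x783F0000) /* bits 1-4 and 10-15  */
#define SRR1_MSR_COPY_MASK UINT32_C(0x07C0FFFF) /* bits 5-9 and 16-31  */

#define MSR_ILE UINT32_C(0x00010000) /* bit 15 */
#define MSR_ME  UINT32_C(0x00001000) /* bit 19 */
#define MSR_IP  UINT32_C(0x00000040) /* bit 25 */
#define MSR_LE  UINT32_C(0x00000001) /* bit 31 */

void take_exception(cpu_state *cpu, uint32_t resume_addr,
                    uint32_t vector_offset, uint32_t srr1_info)
{
    assert(cpu != NULL);
    uint32_t old_msr = cpu->msr;

    /* Step 1: SRR0 receives the address at which to resume. */
    cpu->srr0 = resume_addr;

    /* Steps 2 and 3: exception-specific bits plus a copy of MSR bits. */
    cpu->srr1 = (srr1_info & SRR1_INFO_MASK) | (old_msr & SRR1_MSR_COPY_MASK);

    /* Step 4: new MSR. IR and DR are cleared, so translation is off. */
    uint32_t new_msr = old_msr & (MSR_ILE | MSR_ME | MSR_IP);
    if (old_msr & MSR_ILE) {
        new_msr |= MSR_LE;
    }
    cpu->msr = new_msr;

    /* Step 5: vector address = prefix selected by MSR[IP] + vector offset. */
    uint32_t base = (new_msr & MSR_IP) ? UINT32_C(0xFFF00000) : UINT32_C(0);
    cpu->pc = base + vector_offset;
}
```

For instance, with MSR = 0x0000D032 (EE, PR, ME, IR, DR, and RI set) and the instruction address breakpoint offset 0x01300, the model leaves SRR1 = 0x0000D032, MSR = 0x00001000, and the next fetch address at 0x00001300.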

4.2.3 Setting MSR[RI] 

The operating system should handle MSR[RI] as follows: 

• In the machine check and system reset exceptions — If SRR1[RI] is cleared, the 
exception is not recoverable. If it is set, the exception is recoverable with respect to 
the processor. 

• In each exception handler — When enough state information has been saved that a 
machine check or system reset exception can reconstruct the previous state, set 
MSR[RI]. 

• In each exception handler — Clear MSR[RI], set the SRR0 and SRR1 registers
appropriately, and then execute rfi. 

• Note that the RI bit being set indicates that, with respect to the processor, enough 
processor state data is valid for the processor to continue, but it does not guarantee 
that the interrupted process can resume. 

4.2.4 Returning from an Exception Handler

The Return from Interrupt (rfi) instruction performs context synchronization by allowing 
previously issued instructions to complete before returning to the interrupted process. In 
general, execution of the rfi instruction ensures the following: 

• All previous instructions have completed to a point where they can no longer cause 
an exception. If a previous instruction causes a direct-store interface error exception, 
the results must be determined before this instruction is executed. 

• Previous instructions complete execution in the context (privilege, protection, and 
address translation) under which they were issued. 

• The rfi instruction copies SRRl bits back into the MSR. 

• The instructions following this instruction execute in the context established by this 
instruction. 

For a complete description of context synchronization, refer to Chapter 6, “Exceptions,” of 
The Programming Environments Manual.
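
The register effect of rfi can be illustrated with a short C model. This is a sketch under stated assumptions, not the architectural definition: the cpu_state type and function name are hypothetical, the set of MSR bits restored from SRR1 is simplified to bits 16-31, and the resume address is assumed to be SRR0 with its two low-order bits forced to zero. The clearing of MSR[TGPR] follows the TGPR description in Table 4-6. Context synchronization itself is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the registers touched by rfi. */
typedef struct {
    uint32_t msr;
    uint32_t srr0;
    uint32_t srr1;
    uint32_t pc;     /* address of the next instruction to fetch */
} cpu_state;

/* Assumption: only MSR bits 16-31 are restored from SRR1 in this model. */
#define RFI_MSR_MASK UINT32_C(0x0000FFFF)
#define MSR_TGPR     UINT32_C(0x00020000) /* bit 14 */

void return_from_interrupt(cpu_state *cpu)
{
    assert(cpu != NULL);

    /* Copy the saved MSR bits back from SRR1. */
    cpu->msr = (cpu->msr & ~RFI_MSR_MASK) | (cpu->srr1 & RFI_MSR_MASK);

    /* The TGPR bit is cleared by rfi (see Table 4-6). */
    cpu->msr &= ~MSR_TGPR;

    /* Resume at the word-aligned address held in SRR0. */
    cpu->pc = cpu->srr0 & ~UINT32_C(3);
}
```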

4.3 Process Switching 

The operating system should execute one of the following when processes are switched: 

• The sync instruction, which orders the effects of instruction execution. All
instructions previously initiated appear to have completed before the sync
instruction completes, and no subsequent instructions appear to be initiated until the
sync instruction completes. For an example showing use of the sync instruction, see
Chapter 2, “PowerPC Register Set,” of The Programming Environments Manual.

• The isync instruction, which waits for all previous instructions to complete and then 
discards any fetched instructions, causing subsequent instructions to be fetched (or 
refetched) from memory and to execute in the context (privilege, translation, 
protection, etc.) established by the previous instructions. 

• The stwcx. instruction, to clear any outstanding reservations, which ensures that an 
lwarx instruction in the old process is not paired with an stwcx. instruction in the
new process. 

The operating system should set the MSR[RI] bit as described in Section 4.2.3, “Setting 
MSR[RI].” 

4.4 Exception Latencies

Latencies for taking various exceptions depend on the state of the machine when the 
exception conditions occur. This latency may be as short as one cycle, in which case an 
exception is signaled in the cycle following the appearance of the exception condition. The 
latencies are as follows: 

• Hard reset and machine check — In most cases, a hard reset or machine check
exception will have a single-cycle latency. A two-to-three-cycle delay may occur 
only when a predicted instruction is next to complete, and the branch guess that 
forced this instruction to be predicted was resolved to be incorrect. 

• Soft reset — The latency of a soft reset exception is affected by recoverability. The
time to reach a recoverable state may depend on the time needed to complete or 
except an instruction at the point of completion, the time needed to drain the 
completed store queue, and the time waiting for a correct empty state so that a valid 
MSR[IP] may be saved. For lower-priority externally-generated interrupts, a delay 
may be incurred waiting for another interrupt, generated while reaching a 
recoverable state, to be serviced. 

Further delays are possible for other types of exceptions depending on the number and type 
of instructions that must be completed before those exceptions may be serviced. See
Section 4.1.2, “Summary of Front-End Exception Handling,” to determine possible 
maximum latencies for different exceptions. 

4.5 Exception Definitions 

Table 4-8 shows all the types of exceptions that can occur with the 603e and the MSR bit 
settings when the processor transitions to supervisor mode. The state of these bits prior to 
the exception is typically stored in SRR1.



Table 4-8. MSR Setting Due to Exception

                                MSR Bit
Exception Type                  POW  TGPR  ILE  EE  PR  FP  ME  FE0  SE  BE  FE1  IP  IR  DR  RI  LE

System reset                    0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
Machine check                   0    0     —    0   0   0   0   0    0   0   0    —   0   0   0   ILE
DSI                             0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
ISI                             0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
External                        0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
Alignment                       0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
Program                         0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
Floating-point unavailable      0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
Decrementer                     0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
System call                     0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
Trace exception                 0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
ITLB miss                       0    1     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
DTLB miss on load               0    1     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
DTLB miss on store              0    1     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
Instruction address breakpoint  0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE
System management interrupt     0    0     —    0   0   0   —   0    0   0   0    —   0   0   0   ILE

0    Bit is cleared
1    Bit is set
ILE  Bit is copied from the ILE bit in the MSR.
—    Bit is not altered
Reserved bits are read as if written as 0.



4.5.1 Reset Exceptions (0x00100) 

The system reset exception is a nonmaskable, asynchronous exception signaled to the 603e
either through the assertion of the reset signals (SRESET or HRESET) or internally during
the power-on reset (POR) process. The assertion of the soft reset signal, SRESET, as
described in Section 7.2.9.6.2, “Soft Reset (SRESET) — Input,” causes the soft reset
exception to be taken and the physical base address of the handler is determined by the
MSR[IP] bit. The assertion of the hard reset signal, HRESET, as described in
Section 7.2.9.6.1, “Hard Reset (HRESET) — Input,” causes the hard reset exception to be
taken and the physical address of the handler is always 0xFFF0_0100.

4.5.1.1 Hard Reset and Power-On Reset

As described in Section 4.1.2, “Summary of Front-End Exception Handling,” the hard reset
exception is a nonrecoverable, nonmaskable asynchronous exception.
When HRESET is asserted or at power-on reset (POR), the 603e immediately branches to
0xFFF0_0100 without attempting to reach a recoverable state. A hard reset has the highest 
priority of any exception. It is always nonrecoverable. Table 4-9 shows the state of the 
machine just before it fetches the first instruction of the system reset handler after a hard 
reset. 

The HRESET signal can be asserted for the following reasons: 

• System power-on reset 

• System reset from a panel switch 

• An action required by the ESP utility 

For information on the HRESET signal, see Section 7.2.9.6.1, “Hard Reset (HRESET) — 
Input.” 



Table 4-9. Settings Caused by Hard Reset

Register        Setting       Register          Setting

GPRs            Unknown       PVR               oooaooon
FPRs            Unknown       HID0              00000000
FPSCR           00000000      HID1              00000000
CR              All 0s        DMISS and IMISS   All 0s
SRs             Unknown       DCMP and ICMP     All 0s
MSR             00000040      RPA               All 0s
XER             00000000      IABR              All 0s
TBU             00000000      DSISR             00000000
TBL             00000000      DAR               00000000
LR              00000000      DEC               FFFFFFFF
CTR             00000000      HASH1             00000000
SDR1            00000000      HASH2             00000000
SRR0            00000000      TLBs              Unknown
SRR1            00000000      Cache             All cache blocks invalidated
SPRGs           00000000      BATs              Unknown
Tag directory   All 0s. (However, LRU bits are initialized so each side of the
                cache has a unique LRU value.)

The following is also true after a hard reset operation:

• External checkstops are enabled. 

• The on-chip test interface has given control of the I/Os to the rest of the chip for 
functional use. 

• Since the reset exception has data and instruction translation disabled (MSR[DR] 
and MSR[IR] both cleared), the chip operates in direct address translation mode 
(referred to as the real addressing mode in the architecture specification). 

4.5.1.2 Soft Reset

As described in Section 4.1.2, “Summary of Front-End Exception Handling,” the soft reset 
exception is a type of system reset exception that is recoverable, nonmaskable, and 
asynchronous. When SRESET is asserted, the processor attempts to reach a recoverable 
state by allowing the next instruction to either complete or cause an exception, blocking the 
completion of subsequent instructions, and allowing the completed store queue to drain. 

Unlike a hard reset, the latches are not initialized and the instruction cache is disabled. The
SRESET signal must be asserted for at least two bus clock cycles. After the SRESET signal
is deasserted, the 603e vectors to the system reset routine at 0xFFF0_0100.

When a soft reset occurs, registers are set as shown in Table 4-10. 



Table 4-10. Soft Reset Exception — Register Settings

Register   Setting Description

SRR0       Set to the effective address of the instruction that the processor would have
           attempted to complete next if no exception conditions were present.

SRR1       0-15   Cleared
           16-31  Loaded from bits 16-31 of the MSR. Note that if the processor state is
                  corrupted to the extent that execution cannot be reliably restarted,
                  SRR1[30] is cleared.

MSR        POW   0     EE   0     FE0   0     IR   0
           TGPR  0     PR   0     SE    0     DR   0
           ILE   —     FP   0     BE    0     RI   0
           IP    —     ME   —     FE1   0     LE   Set to value of ILE



The vector address for a soft reset is 0x0000_0100 if MSR[IP] is cleared or 0xFFF0_0100 
if MSR[IP] is set. A soft reset is recoverable provided that attaining the recoverable state 
does not cause a machine check exception. This interrupt case is third in priority, following 
hard reset and machine check. 

4.5.2 Machine Check Exception (0x00200)

The 603e conditionally initiates a machine check exception after detecting the assertion of 
the TEA or MCP signals on the 603e bus (assuming the machine check is enabled, 
MSR[ME] = 1). The assertion of one of these signals indicates that a bus error occurred and 
the system terminates the current transaction. One clock cycle after the signal is asserted, 
the data bus signals go to the high-impedance state; however, data entering the GPR or the 
cache is not invalidated. Note that if HID0[EMCP] is cleared, the processor ignores the 
assertion of the MCP signal. 

Note that the 603e makes no attempt to force recoverability; however, it does guarantee the 
machine check exception is always taken immediately upon request, with a nonpredicted 
address saved in SRR0, regardless of the current machine state. Any pending stores in the 
completed store queue are cancelled when the exception is taken. Software can use the 
machine check exception in a recoverable mode for checking bus configuration. For this 
case, a sync, load, sync instruction sequence is used. A subsequent machine check 
exception at the load address indicates a bus configuration problem and the processor is in 
a recoverable state. 

If the MSR[ME] bit is set, the exception is recognized and handled; otherwise, the 603e 
attempts to enter an internal checkstop. Note that the resulting machine check exception has 
priority over any exceptions caused by the instruction that generated the bus operation. 

Machine check exceptions are only enabled when MSR[ME] = 1; this is described in 
Section 4.5.2.1, “Machine Check Exception Enabled (MSR[ME] = 1).” If MSR[ME] = 0 
and a machine check occurs, the processor enters the checkstop state. Checkstop state is 
described in Section 4.5.2.2, “Checkstop State (MSR[ME] = 0).” 
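
The interaction of the TEA and MCP signals with MSR[ME] and HID0[EMCP] can be summarized as a small decision function. This is a sketch of the rules stated in this section only, with hypothetical names; it does not model bus timing or any other checkstop source.

```python
# Illustrative sketch; the function name and the returned strings are hypothetical.
def machine_check_outcome(signal, msr_me, hid0_emcp):
    """Classify the response to an asserted TEA or MCP signal."""
    if signal not in ("TEA", "MCP"):
        raise ValueError("signal must be 'TEA' or 'MCP'")
    # With HID0[EMCP] cleared, assertion of MCP is ignored.
    if signal == "MCP" and not hid0_emcp:
        return "ignored"
    # With MSR[ME] set the exception is taken; otherwise the processor
    # attempts to enter the checkstop state.
    return "machine check exception" if msr_me else "checkstop"


print(machine_check_outcome("TEA", msr_me=1, hid0_emcp=0))
print(machine_check_outcome("MCP", msr_me=0, hid0_emcp=1))
```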

4.5.2.1 Machine Check Exception Enabled (MSR[ME] = 1) 

When a machine check exception is taken, registers are updated as shown in Table 4-11. 

Table 4-11. Machine Check Exception—Register Settings 

Register  Setting Description
--------  -------------------
SRR0      Set to the address of the next instruction that would have been completed in the
          interrupted instruction stream. Neither this instruction nor any others beyond it
          will have been completed. All preceding instructions will have been completed.

SRR1      0-11   Cleared
          12     MCP — Machine check signal caused exception
          13     TEA — Transfer error acknowledge signal caused exception
          14     DPE — Data parity error signal caused exception
          15     APE — Address parity error signal caused exception
          16-31  Loaded from MSR[16-31]

MSR       POW   0    EE  0    FE0  0    IR  0
          TGPR  0    PR  0    SE   0    DR  0
          ILE   —    FP  0    BE   0    RI  0
          IP    —    ME  —    FE1  0    LE  Set to value of ILE

Note that when a machine check exception is taken, the exception handler should set MSR[ME] 
as soon as it is practical to handle another TEA assertion. Otherwise, subsequent TEA assertions 
cause the processor to automatically enter the checkstop state. 
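
A handler can identify the source of a machine check from SRR1 bits 12-15 as listed in Table 4-11. The sketch below assumes the PowerPC bit-numbering convention, in which bit 0 is the most-significant bit of a 32-bit register; the function name is hypothetical.

```python
# Illustrative sketch; the function name is hypothetical.
# Bit 0 is the most-significant bit, so bit n corresponds to 1 << (31 - n).
CAUSE_BITS = {12: "MCP", 13: "TEA", 14: "DPE", 15: "APE"}


def machine_check_causes(srr1):
    """Return the names of the cause bits set in a saved SRR1 value."""
    return [name for bit, name in CAUSE_BITS.items() if srr1 & (1 << (31 - bit))]


print(machine_check_causes(0x00040000))  # ['TEA']
```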



When a machine check exception is taken, instruction execution for the handler begins at 
offset 0x00200 from the physical base address indicated by MSR[IP]. 

To return to the main program, the exception handler should do the following: 

1. Load SRR0 and SRR1 with the values to be used by the rfi instruction. 

2. Execute rfi. 

4.5.2.2 Checkstop State (MSR[ME] = 0) 

When the 603e enters the checkstop state, it asserts the checkstop output signal, 
CKSTP_OUT. The following events will cause the 603e to enter the checkstop state: 

• Machine check exception occurs with MSR[ME] cleared. 

• External checkstop input, CKSTP_IN, is asserted. 

• An extended transfer protocol error occurs. 

When a processor is in the checkstop state, instruction processing is suspended and 
generally cannot be restarted without resetting the processor. The contents of all latches are 
frozen within two cycles upon entering the checkstop state so that the state of the processor 
can be analyzed as an aid in problem determination. 

Note that not all PowerPC processors provide the same level of error checking. The reasons 
a processor can enter checkstop state are implementation-dependent. 

















4.5.3 DSI Exception (0x00300) 

A DSI exception occurs when no higher priority exception exists and a data memory access 
cannot be performed. The condition that caused the DSI exception can be determined by 
reading the DSISR register, a supervisor-level SPR (SPR18) that can be read by using the 
mfspr instruction. Bit settings are provided in Table 4-12. Table 4-12 also indicates which 
memory element is saved to the DAR. DSI exceptions can occur for any of the following 
reasons: 

• The instruction is not supported for the type of memory addressed. 

• Any access to a direct-store segment (SR[T] = 1). 

• The access violates memory protection. Access is not permitted by the key (Ks and 
Kp) and PP bits, which are set in the segment register and PTE for page protection 
and in the BATs for block protection. 

Note that the OEA specifies an additional case that may cause a DSI exception — when an 
effective address for a load, store, or cache operation cannot be translated by the TLBs. On 
the 603e, this condition causes a TLB miss exception instead. 

These scenarios are common among all PowerPC processors. The following additional 
scenarios can cause a DSI exception in the 603e: 

• A bus error indicates crossing from a direct-store segment to a memory segment. 

• The execution of any load/store instruction to a direct-store segment, SR[T] = 1. 

• A data access crosses from a memory segment (SR[T] = 0) into a direct-store 
segment (SR[T] = 1). 

DSI exceptions can be generated by load/store instructions, and the cache control 
instructions (dcbi, dcbz, dcbst, and dcbf). 

The 603e supports the crossing of page boundaries. However, if the second page has a 
translation error or protection violation associated with it, the 603e will take the DSI 
exception in the middle of the instruction. In this case, the data address register (DAR) 
always points to a byte address in the first word of the offending page. 

If an stwcx. instruction has an effective address for which a normal store operation would 
cause a DSI exception, the 603e will take the DSI exception without checking for the 
reservation. 

If the XER indicates that the byte count for an lswi or stswi instruction is zero, a DSI 
exception does not occur, regardless of the effective address. 

The condition that caused the exception is defined in the DSISR. These conditions also use 
the data address register (DAR) as shown in Table 4-12. 



Table 4-12. DSI Exception—Register Settings 

Register  Setting Description
--------  -------------------
SRR0      Set to the effective address of the instruction that caused the exception.

SRR1      0-15   Cleared
          16-31  Loaded with bits 16-31 of the MSR

MSR       POW   0    EE  0    FE0  0    IR  0
          TGPR  0    PR  0    SE   0    DR  0
          ILE   —    FP  0    BE   0    RI  0
          IP    —    ME  —    FE1  0    LE  Set to value of ILE

DSISR     0      Set if a load or store instruction results in a direct-store error exception.
          1      Set by the data TLB miss exception handler if the translation of an attempted
                 access is not found in the primary hash table entry group (HTEG), or in the
                 rehashed secondary HTEG, or in the range of a DBAT register; otherwise cleared.
          2-3    Cleared
          4      Set if a memory access is not permitted by the page or BAT protection
                 mechanism; otherwise cleared.
          5      Set if the lwarx or stwcx. instruction is attempted to direct-store space.
          6      Set for a store operation and cleared for a load operation.
          7-31   Cleared

DAR       Set to the effective address of a memory element as described in the following list:
          • A byte in the first word accessed in the page that caused the DSI exception, for a
            byte, half-word, or word memory access.
          • A byte in the first word accessed in the BAT area that caused the DSI exception,
            for a byte, half-word, or word access to a BAT area.
          • A byte in the block that caused the exception for icbi, dcbz, dcbst, dcbf, or dcbi
            instructions.
          • Any EA in the memory range addressed (for direct-store exceptions).



When a DSI exception is taken, instruction execution for the handler begins at offset 
0x00300 from the physical base address indicated by MSR[IP]. 
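
The DSISR settings in Table 4-12 can be decoded with simple masks. The following sketch assumes the PowerPC bit-numbering convention (bit 0 is the most-significant bit); the function name and the description strings are hypothetical shorthand for the table entries.

```python
# Illustrative sketch; names and description strings are hypothetical.
DSI_REASON_BITS = {
    0: "direct-store error",
    1: "translation not found",
    4: "protection violation",
    5: "lwarx/stwcx. to direct-store space",
}


def decode_dsi_dsisr(dsisr):
    """Return (access, reasons) for a DSISR value saved by a DSI exception."""
    def is_set(bit):
        return bool(dsisr & (1 << (31 - bit)))

    reasons = [text for bit, text in DSI_REASON_BITS.items() if is_set(bit)]
    access = "store" if is_set(6) else "load"  # DSISR[6]
    return access, reasons


print(decode_dsi_dsisr(0x0A000000))
```

Here 0x0A000000 has bits 4 and 6 set, so the sketch reports a store that violated memory protection.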

The architecture permits certain instructions to be partially executed when they cause a DSI 
exception. These are as follows: 

• Load multiple or load string instructions — Some registers in the range of registers 
to be loaded may have been loaded. 

• Store multiple or store string instructions — Some bytes of memory in the range 
addressed may have been updated. 

In these cases, the number of registers and amount of memory altered are instruction- and 
boundary-dependent. However, memory protection is not violated. Furthermore, if some of 
the data accessed is in direct-store space (SR[T] = 1) and the instruction is not supported 
for direct-store accesses, the locations in direct-store space are not accessed. 

For update forms, the update register (rA) is not altered. 

4.5.4 ISI Exception (0x00400) 

The ISI exception is implemented as it is defined by the PowerPC architecture. An ISI 
exception occurs when no higher priority exception exists and an attempt to fetch the next 
instruction fails for any of the following reasons: 

• If an instruction TLB miss fails to find the desired PTE, then a page fault is 
synthesized. The ITLB miss handler branches to the ISI exception handler to 
retrieve the translation from a storage device. 

• An attempt is made to fetch an instruction from a direct-store segment while 
instruction translation is enabled (MSR[IR] = 1). 

• An attempt is made to fetch an instruction from no-execute storage. 

• An attempt is made to fetch an instruction from guarded storage when MSR[IR] = 1. 

• The fetch access violates memory protection. 

Register settings for this exception are described in Chapter 6, “Exceptions,” in The 
Programming Environments Manual. 

When an ISI exception is taken, instruction execution for the handler begins at offset 
0x00400 from the physical base address indicated by MSR[IP]. 

4.5.5 External Interrupt (0x00500) 

An external interrupt is signaled to the 603e by the assertion of the INT signal as described 
in Section 7.2.9.1, “Interrupt (INT) — Input.” The interrupt may not be recognized if a 
higher priority exception occurs simultaneously or if the MSR[EE] bit is cleared when INT 
is asserted. 

After the INT is detected (and provided that MSR[EE] is set), the 603e generates a 
recoverable halt to instruction completion. The 603e requires the next instruction in 
program order to complete or cause an exception, blocks completion of any following 
instructions, and allows the completed store queue to drain. If any other exceptions are encountered in this 
process, they are taken first and the external interrupt is delayed until a recoverable halt is 
achieved. At this time the 603e saves the state information and takes the external interrupt 
as defined in the PowerPC architecture. 

The register settings for the external interrupt are shown in Table 4-13. 

Table 4-13. External Interrupt — Register Settings 

Register  Setting
--------  -------
SRR0      Set to the effective address of the instruction that the processor would have
          attempted to execute next if no interrupt conditions were present.

SRR1      0-15   Cleared
          16-31  Loaded from bits 16-31 of the MSR

MSR       POW   0    EE  0    FE0  0    IR  0
          TGPR  0    PR  0    SE   0    DR  0
          ILE   —    FP  0    BE   0    RI  0
          IP    —    ME  —    FE1  0    LE  Set to value of ILE



When an external interrupt is taken, instruction execution for the handler begins at offset 
0x00500 from the physical base address indicated by MSR[IP]. 
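
The SRR1 and MSR settings tabulated for the external interrupt can be expressed as a small state update. This sketch is an illustration, not a description of the hardware implementation: the MSR bit positions used (ILE = 15, ME = 19, IP = 25, LE = 31, with bit 0 as the most-significant bit) are assumed from the MSR definition elsewhere in this manual rather than from this section, and the function name is hypothetical.

```python
# Illustrative sketch; names are hypothetical and the MSR bit positions are
# assumed from the MSR definition, not stated in this section.
def mask(bit):
    """Mask for PowerPC bit number `bit` (bit 0 is the most-significant bit)."""
    return 1 << (31 - bit)


MSR_ILE = mask(15)
MSR_ME = mask(19)
MSR_IP = mask(25)
MSR_LE = mask(31)


def take_external_interrupt(msr):
    """Return (srr1, new_msr) following the external interrupt register settings."""
    srr1 = msr & 0x0000FFFF                       # bits 0-15 cleared, 16-31 from MSR
    new_msr = msr & (MSR_ILE | MSR_ME | MSR_IP)   # unchanged bits; all others cleared
    if msr & MSR_ILE:
        new_msr |= MSR_LE                         # LE is set to the value of ILE
    return srr1, new_msr


srr1, new_msr = take_external_interrupt(0x0000B032)
print(hex(srr1), hex(new_msr))  # 0xb032 0x1000
```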



The 603e only recognizes the interrupt condition (INT asserted) if the MSR[EE] bit is set; 
it ignores the interrupt condition if the MSR[EE] bit is cleared. To guarantee that the 
external interrupt is taken, the INT signal must be held active until the 603e takes the 
interrupt. If the INT signal is negated before the interrupt is taken, the 603e is not 
guaranteed to take an external interrupt. The interrupt handler must send a command to the 
device that asserted INT, acknowledging the interrupt and instructing the device to negate 
INT. 



4.5.6 Alignment Exception (0x00600) 

This section describes conditions that can cause alignment exceptions in the 603e. Similar 
to DSI exceptions, alignment exceptions use the SRR0 and SRR1 to save the machine state 
and the DSISR to determine the source of the exception. The 603e will initiate an alignment 
exception when it detects any of the following conditions: 

• The operand of a floating-point load or store operation is not word-aligned. 

• The operand of an lmw, stmw, lwarx, or stwcx. instruction is not word-aligned. 

• A little-endian access (MSR[LE] = 1) is misaligned. 

• A multiple or string access is attempted with the MSR[LE] bit set. 

• The operand of a dcbz instruction is in a page that is write-through or caching-inhibited. 

The register settings for alignment exceptions are shown in Table 4-14. 

Table 4-14. Alignment Interrupt—Register Settings 

Register  Setting
--------  -------
SRR0      Set to the effective address of the instruction that caused the exception.

SRR1      0-15   Cleared
          16-31  Loaded from bits 16-31 of the MSR

MSR       POW   0    EE  0    FE0  0    IR  0
          TGPR  0    PR  0    SE   0    DR  0
          ILE   —    FP  0    BE   0    RI  0
          IP    —    ME  —    FE1  0    LE  Set to value of ILE

DSISR     0-11   Cleared
          12-13  Cleared. (Note that these bits can be set by several 64-bit PowerPC
                 instructions that are not supported in the 603e.)
          14     Cleared
          15-16  For instructions that use register indirect with index addressing — set to
                 bits 29-30 of the instruction.
                 For instructions that use register indirect with immediate index
                 addressing — cleared.
          17     For instructions that use register indirect with index addressing — set to
                 bit 25 of the instruction.
                 For instructions that use register indirect with immediate index
                 addressing — set to bit 5 of the instruction.
          18-21  For instructions that use register indirect with index addressing — set to
                 bits 21-24 of the instruction.
                 For instructions that use register indirect with immediate index
                 addressing — set to bits 1-4 of the instruction.
          22-26  Set to bits 6-10 (identifying either the source or destination) of the
                 instruction. Undefined for dcbz.
          27-31  Set to bits 11-15 of the instruction (rA).
                 Set to either bits 11-15 of the instruction or to any register number not in
                 the range of registers loaded by a valid form instruction, for lmw, lswi, and
                 lswx instructions.
                 Otherwise undefined.

DAR       Set to the EA of the data access as computed by the instruction causing the
          alignment exception.



The architecture does not support the use of an unaligned EA by lwarx or stwcx. 
instructions. If one of these instructions specifies an unaligned EA, the exception handler 
should not emulate the instruction, but should treat the occurrence as a programming error. 
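
The DSISR encoding in Table 4-14 copies selected fields of the faulting instruction word. The sketch below reproduces that mapping for the general case, assuming the PowerPC bit-numbering convention (bit 0 is the most-significant bit). The helper names are hypothetical, and the special cases noted in the table (dcbz, lmw, lswi, and lswx) are not modeled.

```python
# Illustrative sketch; helper names are hypothetical.
def field(word, first, last):
    """Extract bits first..last (inclusive) of a 32-bit word, bit 0 = MSB."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def place(value, last):
    """Position a field so that its least-significant bit lands on bit `last`."""
    return value << (31 - last)


def alignment_dsisr(instr, indexed):
    """Build the DSISR for an alignment exception from the instruction word.

    indexed=True  -> register indirect with index addressing
    indexed=False -> register indirect with immediate index addressing
    """
    if indexed:
        dsisr = (place(field(instr, 29, 30), 16)
                 | place(field(instr, 25, 25), 17)
                 | place(field(instr, 21, 24), 21))
    else:
        dsisr = place(field(instr, 5, 5), 17) | place(field(instr, 1, 4), 21)
    dsisr |= place(field(instr, 6, 10), 26)   # source or destination register
    dsisr |= place(field(instr, 11, 15), 31)  # rA
    return dsisr


# lwz r3,2(r4): primary opcode 32, rD = 3, rA = 4, displacement 2
lwz = (32 << 26) | (3 << 21) | (4 << 16) | 2
print(hex(alignment_dsisr(lwz, indexed=False)))  # 0x64
```

A handler that emulates the access can recover the register numbers and the instruction type from the DSISR alone, without reading the instruction from memory.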

4.5.6.1 Integer Alignment Exceptions 

The 603e is optimized for load and store operations that are aligned on natural boundaries. 
Operations that are not naturally aligned may suffer performance degradation, depending 
on the type of operation, the boundaries crossed, and the mode that the processor is in 
during execution. More specifically, these operations may either cause an alignment 
exception or they may cause the processor to break the memory access into multiple, 
smaller accesses with respect to the cache and the memory subsystem. 

The 603e can initiate an alignment exception for the access shown in Table 4-15. In these 
cases, the appropriate range check is performed before the instruction begins execution. As 
a result, if an alignment exception is taken, it is guaranteed that no portion of the instruction 
has been executed. 



Table 4-15. Access Types 

MSR[DR]  SR[T]  Access Type
-------  -----  -----------
1        0      Page-address translation access



4.5.6.1.1 Page Address Translation Access 

A page-address translation access occurs when MSR[DR] is set, SR[T] is cleared and there 
is not a match in the BAT. Note the following points: 

• The following is true for all loads and stores except strings/multiples: 

— Byte operands never cause an alignment exception. 

— Half-word operands can cause an alignment exception if the EA ends in 0xFFF. 

— Word operands can cause an alignment exception if the EA ends in 0xFFD-FFF. 

— Double-word operands cause an alignment exception if the EA ends in 
0xFF9-FFF. 

• The dcbz instruction causes an alignment exception if the access is to a page or 
block with the W (write-through) or I (cache-inhibit) bit set in the TLB or BAT, 
respectively. 
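
The EA ranges listed above for half-word, word, and double-word operands are the ones in which the operand extends past the end of a 4-Kbyte page. Under that reading, which is an interpretation of the list rather than a statement from this manual, the check can be sketched as follows; the function name is hypothetical.

```python
# Illustrative sketch; the function name is hypothetical and the page-crossing
# interpretation is an assumption drawn from the EA ranges listed above.
PAGE_SIZE = 0x1000  # 4-Kbyte page


def operand_crosses_page(ea, size):
    """Return True if a `size`-byte operand at `ea` extends into the next page."""
    return (ea & (PAGE_SIZE - 1)) + size > PAGE_SIZE


for ea, size in [(0x1FFF, 1), (0x1FFF, 2), (0x2FFD, 4), (0x2FFC, 4), (0x3FF9, 8)]:
    print(hex(ea), size, operand_crosses_page(ea, size))
```

Byte operands never satisfy the condition, which matches the first item in the list.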

A misaligned memory access that does not cause an alignment exception will not perform 
as well as an aligned access of the same type. The resulting performance degradation due 
to misaligned accesses depends on how well each individual access behaves with respect 
to the memory hierarchy. At a minimum, additional cache access cycles are required that 
can delay other processor resources from using the cache. More dramatically, for an access 
to a noncacheable page, each discrete access involves individual processor bus operations 
that reduce the effective bandwidth of that bus. 

Finally, note that when the 603e is in page address translation mode, there is no special 
handling for accesses that fall into BAT regions. 

4.5.6.2 Floating-Point Alignment Exceptions 

The 603e implements the alignment exception as it is defined in the PowerPC architecture. 
For information on bit settings and how exception conditions are detected, refer to The 
Programming Environments Manual. 

Note that the PowerPC architecture allows individual processors to determine whether an 
exception is required to handle various alignment conditions. The 603e initiates an 
alignment exception when it detects any of the following conditions: 

• The operand of a floating-point load or store operation is not word-aligned. 

• The operand of a dcbz instruction is in a page that is write-through or caching-inhibited 
for a virtual mode access. 

• The operand of an lmw, stmw, lwarx, or stwcx. instruction is not word-aligned. 
Note that unlike other alignment exceptions, which store the address as computed 
by the instruction in the DAR, alignment exceptions for load or store multiple 
instructions store that address value + 4 into the DAR. 

• A little-endian access is misaligned. 

• A multiple access is attempted while the little-endian, MSR[LE], bit is set. 

4.5.7 Program Exception (0x00700) 

The 603e implements the program exception as it is defined by the PowerPC architecture 
(OEA). A program exception occurs when no higher priority exception exists and one or 
more of the exception conditions defined in the OEA occur. 

When a program exception is taken, instruction execution for the handler begins at offset 
0x00700 from the physical base address indicated by MSR[IP]. The exception conditions 
are as follows: 

• Floating-point enabled exception — These exceptions correspond to IEEE-defined 
exception conditions, such as overflows, and divide by zeroes that may occur during 
the execution of a floating-point arithmetic instruction. As a group, these exceptions 
are enabled by the FE0 and FE1 bits in the MSR. Individual conditions are 
enabled by specific bits in the FPSCR. For general information about this exception, 
see The Programming Environments Manual. For more information about how 
these exceptions are implemented in the 603e, see Section 4.5.7.1, “IEEE 
Floating-Point Exception Program Exceptions.” 

• Illegal instruction — An illegal instruction program exception is generated when 
execution of an instruction is attempted with an illegal opcode or illegal 
combination of opcode and extended opcode fields (including PowerPC instructions 
not implemented in the 603e). These do not include those optional instructions 
treated as no-ops. 

• Privileged instruction — A privileged instruction type program exception is 
generated when the execution of a privileged instruction is attempted and the MSR 
register user privilege bit, MSR[PR], is set. In the 603e, this exception is generated 
for mtspr or mfspr with an invalid SPR field if SPR[0] = 1 and MSR[PR] = 1. This 
may not be true for all PowerPC processors. 

• Trap — A trap type program exception is generated when any of the conditions 
specified in a trap instruction is met. 

4.5.7.1 IEEE Floating-Point Exception Program Exceptions 

In the 603e, floating-point exceptions are signaled by condition bits set in the floating-point 
status and control register (FPSCR). They can cause the system floating-point enabled 
exception handler to be invoked. The 603e handles all floating-point exceptions precisely. 
The 603e implements the FPSCR as it is defined by the PowerPC architecture; for more 
information about the FPSCR, see The Programming Environments Manual. 

Floating-point operations that change exception sticky bits in the FPSCR may suffer a 
performance penalty. When an exception is disabled in the FPSCR and MSR[FE] = 0, 
updates to the FPSCR exception sticky bits are serialized at the completion stage. This 
serialization may result in a one- or two-cycle execution delay. The penalty is incurred only 
when the exception bit is changed and not on subsequent operations with the same 
exception. See Chapter 6, “Instruction Timing,” for a full description of completion 
serialization. 

When an exception is enabled in the FPSCR, the instruction traps to the emulation trap 
exception vector without updating the FPSCR or the target FPR. The emulation trap 
exception handler is required to complete the instruction. The emulation trap exception 
handler is invoked regardless of the FE setting in the MSR. 

The two IEEE floating-point imprecise modes, defined by the PowerPC architecture when 
MSR[FE0] ≠ MSR[FE1], are treated as precise exceptions (that is, MSR[FE0] = 
MSR[FE1] = 1). This is regardless of the setting of MSR[NI]. 

For the highest and most predictable floating-point performance, all exceptions should be 
disabled in the FPSCR and MSR. For more information about the program exception, see 
The Programming Environments Manual. 

4.5.7.2 Illegal, Reserved, and Unimplemented Instructions Program Exceptions 

In accordance with the PowerPC architecture, the 603e considers all instructions defined 
for 64-bit implementations and unimplemented optional instructions, such as fsqrt, eciwx, 
and ecowx as illegal and takes a program exception when one of these instructions is 
encountered. Likewise, if a supervisor-level instruction is encountered when the processor 
is in user-level mode, a privileged instruction-type program exception is taken. 

The 603e implements some instructions, such as double-precision floating-point and 
load/store string instructions in software. These instructions take the 603e-specific 
emulation trap exception (0x01600) rather than a program exception. 

4.5.8 Floating-Point Unavailable Exception (0x00800) 

The floating-point unavailable exception is implemented in the 603e as it is defined in the 
PowerPC architecture. A floating-point unavailable exception occurs when no higher 
priority exception exists, an attempt is made to execute a floating-point instruction 
(including floating-point load, store, and move instructions), and the floating-point 
available bit in the MSR is cleared (MSR[FP] = 0). Register settings for this exception 
are described in Chapter 6, “Exceptions,” in The Programming Environments Manual. 

When a floating-point unavailable exception is taken, instruction execution for the handler 
begins at offset 0x00800 from the physical base address indicated by MSR[IP]. 

4.5.9 Decrementer Interrupt (0x00900) 

The 603e implements the decrementer interrupt exception as it is defined in the PowerPC 
architecture. A decrementer interrupt request is made when the decrementer counts down 
through zero. The request is held until there are no higher priority exceptions and 
MSR[EE] = 1. At this point the decrementer interrupt is taken. If multiple decrementer 
interrupt requests are received before the first can be reported, only one exception is 
reported. The occurrence of a decrementer interrupt cancels the request. Register settings 
for this exception are described in Chapter 6, “Exceptions,” in The Programming 
Environments Manual. 

When a decrementer interrupt is taken, instruction execution for the handler begins at offset 
0x00900 from the physical base address indicated by MSR[IP]. 

4.5.10 System Call Exception (0x00C00) 

The 603e implements the system call exception as it is defined by the PowerPC 
architecture. A system call exception request is made when a system call (sc) instruction is 
completed. If no higher priority exception exists, the system call exception is taken, with 
SRR0 being set to the EA of the instruction following the sc instruction. Register settings 
for this exception are described in Chapter 6, “Exceptions,” in The Programming 
Environments Manual. 

When a system call exception is taken, instruction execution for the handler begins at offset 
0x00C00 from the physical base address indicated by MSR[IP]. 







4.5.11 Trace Exception (0x00D00) 

The trace exception is taken under one of the following conditions: 

• When MSR[SE] is set, a single-step instruction trace exception is taken when no 
higher priority exception exists and any instruction (other than rfi or isync) is 
successfully completed. Note that other PowerPC processors will take the trace 
exception on isync instructions (when MSR[SE] is set); the 603e does not take the 
trace exception on isync instructions. Single-step instruction trace mode is described 
in Section 4.5.11.1, “Single-Step Instruction Trace Mode.” 

• When MSR[BE] is set, the branch trace exception is taken after each branch 
instruction is completed. Branch trace mode is described in Section 4.5.11.2, 
“Branch Trace Mode.” 

• The 603e deviates from the architecture by not taking trace exceptions on isync 
instructions. 



Successful completion implies that the instruction caused no other exceptions. A trace 
exception is never taken for an sc instruction or for a trap instruction that takes a trap 
exception. 

MSR[SE] and MSR[BE] are cleared when the trace exception is taken. In the normal use 
of this function, MSR[SE] and MSR[BE] are restored when the exception handler returns 
to the interrupted program using an rfi instruction. 

Register settings for the trace mode are described in Table 4-16. 



Table 4-16. Trace Exception—Register Settings 

Register  Setting Description
--------  -------------------
SRR0      Set to the address of the instruction following the one for which the trace
          exception was generated.

SRR1      0-15   Cleared
          16-31  Loaded from bits 16-31 of the MSR

MSR       POW   0    EE  0    FE0  0    IR  0
          TGPR  0    PR  0    SE   0    DR  0
          ILE   —    FP  0    BE   0    RI  0
          IP    —    ME  —    FE1  0    LE  Set to value of ILE



Note that a trace or instruction address breakpoint exception condition generates a soft stop 
instead of an exception if soft stop has been enabled by the JTAG/COP logic. If trace and 
breakpoint conditions occur simultaneously, the breakpoint conditions receive higher 
priority. 

When a trace exception is taken, instruction execution for the handler begins at offset 
0x00D00 from the base address indicated by MSR[IP]. 

4.5.11.1 Single-Step Instruction Trace Mode 

The single-step instruction trace mode is enabled by setting MSR[SE]. Encountering the 
single-step breakpoint causes one of the following actions: 

• Trap to address vector 0x00D00 

• Soft stop (wait for quiescence) 

The default single-step trace action traps after an instruction execution and completion. The 
soft stop option, in which the 603e stops in a restartable state after an instruction execution 
and completion, can be enabled only through the COP function. The ESP, which interfaces 
to the COP, can restart the 603e after a soft stop. Refer to the section on JTAG/COP and 
Section 8.9, “IEEE 1149.1-Compliant Interface,” for more information. 

4.5.11.2 Branch Trace Mode 

The branch trace mode is enabled by setting MSR[BE]. Encountering the branch trace 
breakpoint causes one of the following actions: 

• Trap to interrupt vector 0x00D00 

• Soft stop 

• Hard stop 

The default branch trace action is to trap after the completion of any branch instruction 
whenever MSR[BE] is set. However, if soft stop is enabled through the COP interface, the 
603e stops in a restartable state. If hard stop is enabled through the COP interface, the 603e 
stops immediately without waiting to reach a restartable state. Therefore, the 603e is not 
guaranteed to be restartable after a hard stop. For more information, see Section 8.9, “IEEE 
1149.1-Compliant Interface.” 

4.5.12 Instruction TLB Miss Exception (0x01000) 

When the effective address for an instruction load, store, or cache operation cannot be 
translated by the ITLBs, an instruction TLB miss exception is generated. Register settings 
for the instruction and data TLB miss exceptions are described in Table 4-17. 

Table 4-17. Instruction and Data TLB Miss Exceptions—Register Settings 

Register  Setting Description
--------  -------------------
SRR0      Set to the address of the next instruction to be executed in the program for which
          the TLB miss exception was generated.

SRR1      0-3    Loaded from condition register CR0 field
          4-12   Cleared
          13     0 = data TLB miss
                 1 = instruction TLB miss
          14     0 = replace TLB associativity set 0
                 1 = replace TLB associativity set 1
          15     0 = data TLB miss on store (or C = 0)
                 1 = data TLB miss on load
          16-31  Loaded from bits 16-31 of the MSR

MSR       POW   0    EE  0    FE0  0    IR  0
          TGPR  1    PR  0    SE   0    DR  0
          ILE   —    FP  0    BE   0    RI  0
          IP    —    ME  —    FE1  0    LE  Set to value of ILE
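
The SRR1 fields in Table 4-17 give a software table search routine what it needs to identify the miss. The following sketch decodes them, assuming the PowerPC bit-numbering convention (bit 0 is the most-significant bit); the function name and dictionary keys are hypothetical.

```python
# Illustrative sketch; the function name and dictionary keys are hypothetical.
def decode_tlb_miss_srr1(srr1):
    """Split a TLB miss SRR1 value into the fields listed in Table 4-17."""
    def bit(n):
        return (srr1 >> (31 - n)) & 1

    return {
        "cr0": (srr1 >> 28) & 0xF,                          # SRR1[0-3]
        "miss": "instruction" if bit(13) else "data",      # SRR1[13]
        "replace_set": bit(14),                             # SRR1[14]
        "data_access": "load" if bit(15) else "store (or C = 0)",  # SRR1[15]
        "saved_msr_16_31": srr1 & 0xFFFF,                   # SRR1[16-31]
    }


print(decode_tlb_miss_srr1(0x80039032))
```

For 0x80039032 the sketch reports a data TLB miss on a load, with associativity set 1 selected for replacement and a saved CR0 field of 8.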



If the instruction TLB miss exception handler fails to find the desired PTE, then a page fault 
must be synthesized. The handler must restore the machine state and turn off the TGPRs 
before invoking the ISI exception (0x00400). 

Software table search operations are discussed in Chapter 5, “Memory Management.” 

When an instruction TLB miss exception is taken, instruction execution for the handler 
begins at offset 0x01000 from the physical base address indicated by MSR[IP]. 

4.5.13 Data TLB Miss on Load Exception (0x01100) 

When the effective address for a data load or cache operation cannot be translated by the 
DTLBs, a data TLB miss on load exception is generated. Register settings for the 
instruction and data TLB miss exceptions are described in Table 4-17. 

If a data TLB miss exception handler fails to find the desired PTE, then a page fault must 
be synthesized. The handler must restore the machine state and turn off MSR[TGPR] 
before invoking the DSI exception (0x00300). 

Software table search operations are discussed in Chapter 5, “Memory Management.” 

When a data TLB miss on load exception is taken, instruction execution for the handler 
begins at offset 0x01100 from the physical base address indicated by MSR[IP]. 

4.5.14 Data TLB Miss on Store Exception (0x01200) 

When the effective address for a data store or cache operation cannot be translated by the 
DTLBs, a data TLB miss on store exception is generated. The data TLB miss on store 
exception is also taken when the changed bit (C = 0) for a DTLB entry needs to be updated 
for a store operation. Register settings for the instruction and data TLB miss exceptions are 
described in Table 4-17. 

If a data TLB miss exception handler fails to find the desired PTE, then a page fault must 
be synthesized. The handler must restore the machine state and turn off the TGPRs before 
invoking a DSI exception (0x00300). 

Software table search operations are discussed in Chapter 5, “Memory Management.” 

When a data TLB miss on store exception is taken, instruction execution for the handler 
begins at offset 0x01200 from the physical base address indicated by MSR[IP]. 
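
The handler entry points for the exceptions above can be computed as in this short sketch. The offsets are taken from the text. The two base addresses (0x00000000 when MSR[IP] = 0 and 0xFFF00000 when MSR[IP] = 1) are an assumption based on the general PowerPC exception prefix definition, and the dictionary keys are illustrative names.

```python
# Vector offsets quoted in this chapter (keys are illustrative names).
EXCEPTION_OFFSETS = {
    "dsi": 0x00300,
    "isi": 0x00400,
    "instruction_tlb_miss": 0x01000,
    "data_tlb_miss_on_load": 0x01100,
    "data_tlb_miss_on_store": 0x01200,
}


def vector_address(name, msr_ip):
    """Return the physical address at which the handler begins executing.

    Assumption: MSR[IP] = 0 selects base 0x00000000 and MSR[IP] = 1 selects
    base 0xFFF00000.
    """
    base = 0xFFF00000 if msr_ip else 0x00000000
    return base + EXCEPTION_OFFSETS[name]
```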



4.5.15 Instruction Address Breakpoint Exception (0x01300) 

The instruction address breakpoint is controlled by the IABR special-purpose register.
IABR[0-29] holds an effective address to which each instruction is compared. The 
exception is enabled by setting IABR[30]. Note that the 603e ignores the translation enable 
bit (IABR[31]). The exception is taken when an instruction breakpoint address matches on 
the next instruction to complete. The instruction tagged with the match is not completed 
before the instruction address breakpoint exception is taken. 
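
The register layout described above can be modeled as in the following sketch. It assumes PowerPC bit numbering (bit 0 is the most-significant bit), so that IABR[0-29] is the word-aligned address field and IABR[30] is the enable bit. The function names are hypothetical.

```python
IABR_ADDRESS_MASK = 0xFFFFFFFC  # IABR[0-29]: word-aligned effective address
IABR_ENABLE = 0x00000002        # IABR[30]: breakpoint enable
# IABR[31] (translation enable) is ignored by the 603e, so it is not modeled.


def make_iabr(address, enable=True):
    """Build an IABR value for a breakpoint at the given instruction address."""
    return (address & IABR_ADDRESS_MASK) | (IABR_ENABLE if enable else 0)


def iabr_matches(iabr, instruction_ea):
    """Return True if an enabled breakpoint matches the instruction address."""
    if not iabr & IABR_ENABLE:
        return False
    return (iabr & IABR_ADDRESS_MASK) == (instruction_ea & IABR_ADDRESS_MASK)
```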

The breakpoint action can be one of the following: 

• Trap to interrupt vector 0x01300 (default) 

• Soft stop 

The bit settings for when an instruction address breakpoint exception is taken are shown in 
Table 4-18. 



Table 4-18. Instruction Address Breakpoint Exception — Register Settings

Register  Setting Description

SRR0      Set to the address of the next instruction to be executed in the program for which the
          instruction address breakpoint exception was generated.

SRR1      0-15   Cleared
          16-31  Loaded from bits 16-31 of the MSR

MSR       POW   0    EE  0    FE0  0    IR  0
          TGPR  0    PR  0    SE   0    DR  0
          ILE   —    FP  0    BE   0    RI  0
          IP    —    ME  —    FE1  0    LE  Set to value of ILE



The default breakpoint action is to trap before the execution of the matching instruction.
The soft stop feature can be enabled only through the COP interface. With soft stop
enabled, the 603e stops in a restartable state, while with hard stop enabled, the 603e stops
immediately without attempting to reach a restartable state. Upon restarting from a soft
stop, the matching instruction is executed and completed unless it generates an
exception. For soft stops, the next ten instructions that could have passed the IABR check
can be monitored only by single-stepping the processor. When soft stops are used,
address compares must be separated by at least 10 instructions.

If soft stop is enabled, only one soft stop is generated before completion of an instruction 
with an IABR match, regardless of whether a soft stop is generated before that instruction
for any other reason, such as trace mode on for the preceding instruction or a COP soft stop 
request. 

Table 4-19 shows the priority of actions taken when more than one mode is enabled for the 
same instruction. 



Table 4-19. Breakpoint Action for Multiple Modes Enabled for the Same Address

IABR[IE]  MSR[BE]  MSR[SE]  First Action         Next Action          Comments

1         1        0        Instruction address  Trace (branch)       Enabling both modes is useful only if both
                            breakpoint                                trace and address breakpoint interrupts
                                                                      are needed.

1         0        1        Instruction address  Trace (single-step)  Enabling both modes is useful only if
                            breakpoint                                different breakpoint actions are required.

0         1        1        Trace (branch)       None                 The action for branch trace and single-step
                                                                      trace is the same. Enabling both trace
                                                                      modes is redundant except for hard stop
                                                                      on branches.

1         1        1        Instruction address  Trace                Enabling all modes is redundant. This
                            breakpoint                                entry is for clarification only.


Note that a trace or instruction address breakpoint exception condition generates a soft stop 
instead of an exception if soft stop has been enabled by the JTAG/COP logic. If trace and 
breakpoint conditions occur simultaneously, the breakpoint conditions receive higher 
priority. 
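
The ordering in Table 4-19 can be expressed as a simple lookup, as in the sketch below. Only the four combinations listed in the table are covered. The function name and the action strings are illustrative, and the table's "None" entry is represented by Python's None.

```python
# (IABR[IE], MSR[BE], MSR[SE]) -> (first action, next action), per Table 4-19.
BREAKPOINT_ACTIONS = {
    (1, 1, 0): ("instruction address breakpoint", "trace (branch)"),
    (1, 0, 1): ("instruction address breakpoint", "trace (single-step)"),
    (0, 1, 1): ("trace (branch)", None),
    (1, 1, 1): ("instruction address breakpoint", "trace"),
}


def breakpoint_actions(iabr_ie, msr_be, msr_se):
    """Return (first_action, next_action), or None if the combination is not in the table."""
    key = (int(bool(iabr_ie)), int(bool(msr_be)), int(bool(msr_se)))
    return BREAKPOINT_ACTIONS.get(key)
```

In every listed combination that enables the instruction address breakpoint, the breakpoint is the first action, which agrees with the statement that breakpoint conditions receive higher priority than trace conditions.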

The 603e requires that an mtspr instruction that updates the IABR be followed by a
context-synchronizing instruction. If the mtspr instruction enables the instruction address
breakpoint exception, the context-synchronizing instruction cannot generate a breakpoint
response. The 603e also cannot block a breakpoint response on the context-synchronizing
instruction if the breakpoint was disabled by the mtspr instruction. See “Synchronization
Requirements for Special Registers and TLBs” in Chapter 2, “Register Set,” in The
Programming Environments Manual for more information on this requirement.

4.5.16 System Management Interrupt (0x01400)

The system management interrupt behaves like an external interrupt except for the signal 
asserted and the vector taken. A system management interrupt is signaled to the 603e by the 
assertion of the SMI signal. The interrupt may not be recognized if a higher priority
exception occurs simultaneously or if the MSR[EE] bit is cleared when SMI is asserted. 
Note that SMI takes priority over INT if they are recognized simultaneously. 

After the SMI is detected (and provided that MSR[EE] is set), the 603e generates a 
recoverable halt to instruction completion. The 603e requires the next instruction in
program order to complete or take an exception, blocks completion of any following
instructions, and allows the completed store queue to drain. If any higher priority exceptions are encountered
in this process, they are taken first and the system management interrupt is delayed until a 
recoverable halt is achieved. At this time the 603e saves state information and takes the 
system management interrupt. 

The register settings for the system management interrupt are shown in Table 4-20.



Table 4-20. System Management Interrupt — Register Settings

Register  Setting Description

SRR0      Set to the effective address of the instruction that the processor would have attempted
          to complete next if no interrupt conditions were present.

SRR1      0-15   Cleared
          16-31  Loaded from bits 16-31 of the MSR

MSR       POW   0    EE  0    FE0  0    IR  0
          TGPR  0    PR  0    SE   0    DR  0
          ILE   —    FP  0    BE   0    RI  0
          IP    —    ME  —    FE1  0    LE  Set to value of ILE



When a system management interrupt is taken, instruction execution for the handler begins 
at offset 0x01400 from the physical base address indicated by MSR[IP]. 

The 603e recognizes the interrupt condition (SMI asserted) only if the MSR[EE] bit is set
and ignores the interrupt condition otherwise. To guarantee that the system management
interrupt is taken, the SMI signal must be held active until the 603e takes the interrupt. If the SMI
signal is negated before the interrupt is taken, the 603e is not guaranteed to take a system
management interrupt. The interrupt handler must send a command to the device that
asserted SMI, acknowledging the interrupt and instructing the device to negate SMI.

Chapter 5

Memory Management 

This chapter describes the PowerPC 603e microprocessor’s implementation of the memory 
management unit (MMU) specifications provided by the PowerPC operating environment 
architecture (OEA) for PowerPC processors. The 603e MMU implementation is very 
similar to that of the PowerPC 603 microprocessor except that the 603e implements an 
extra key bit in the SRR1 register that simplifies the table search software. In addition,
because the 603e does not support direct-store bus accesses, it causes a DSI exception when 
a direct-store segment is encountered. Refer to Appendix C, “PowerPC 603 Processor 
System Design and Programming Considerations,” for a complete description of the 
differences applicable to the PowerPC 603 microprocessor. 

The primary function of the MMU in a PowerPC processor is the translation of logical 
(effective) addresses to physical addresses (referred to as real addresses in the architecture 
specification) for memory accesses and I/O accesses (I/O accesses are assumed to be
memory-mapped). In addition, the MMU provides access protection on a segment, block, 
or page basis. This chapter describes the specific hardware used to implement the MMU 
model of the OEA in the 603e. Refer to Chapter 7, “Memory Management,” in The 
Programming Environments Manual for a complete description of the conceptual model. 

Two general types of accesses generated by PowerPC processors require address 
translation — instruction accesses, and data accesses to memory generated by load and store 
instructions. Generally, the address translation mechanism is defined in terms of segment 
descriptors and page tables used by PowerPC processors to locate the effective-to-physical 
address mapping for instruction and data accesses. The segment information translates the 
effective address to an interim virtual address, and the page table information translates the 
virtual address to a physical address. 

The segment descriptors, used to generate the interim virtual addresses, are stored as
on-chip segment registers on 32-bit implementations (such as the 603e). In addition, two
translation lookaside buffers (TLBs) are implemented on the 603e to keep recently-used
page address translations on-chip. Although the PowerPC OEA describes one MMU
(conceptually), the 603e hardware maintains separate TLBs and table search resources for
instruction and data accesses that can be accessed independently (and simultaneously).
Therefore, the 603e is described as having two MMUs, one for instruction accesses
(IMMU) and one for data accesses (DMMU).

The block address translation (BAT) mechanism is a software-controlled array that stores
the available block address translations on-chip. BAT array entries are implemented as 
pairs of BAT registers that are accessible as supervisor special-purpose registers (SPRs). 
There are separate instruction and data BAT mechanisms, and in the 603e, they reside in 
the instruction and data MMUs respectively. 

The MMUs, together with the exception processing mechanism, provide the necessary 
support for the operating system to implement a paged virtual memory environment and for 
enforcing protection of designated memory areas. Exception processing is described in 
Chapter 4, “Exceptions.” Section 4.2, “Exception Processing,” describes the MSR, which 
controls some of the critical functionality of the MMUs. 

5.1 MMU Features 

The 603e implements the memory management specification of the PowerPC OEA for
32-bit implementations. Thus, it provides 4 Gbytes of effective address space accessible to
supervisor and user programs with a 4-Kbyte page size and 256-Mbyte segment size. In
addition, the MMUs of 32-bit PowerPC processors use an interim virtual address (52 bits)
and hashed page tables in the generation of 32-bit physical addresses. PowerPC processors
also have a block address translation (BAT) mechanism for mapping large blocks of
memory. Block sizes range from 128 Kbytes to 256 Mbytes and are software-programmable.

The 603e completely implements all features required by the MMU specifications of the 
PowerPC architecture (OEA) for 32-bit implementations. Table 5-1 summarizes all 603e 
MMU features including the architectural features of PowerPC MMUs (defined by the 
OEA) for 32-bit processors and the implementation-specific features provided by the 603e. 



Table 5-1. MMU Features Summary

Feature Category      Architecturally Defined/   Feature
                      603e-Specific

Address ranges        Architecturally defined    2^32 bytes of effective address
                                                 2^52 bytes of virtual address
                                                 2^32 bytes of physical address

Page size             Architecturally defined    4 Kbytes

Segment size          Architecturally defined    256 Mbytes

Block address         Architecturally defined    Range of 128-Kbyte to 256-Mbyte sizes
translation                                      Implemented with IBAT and DBAT registers in BAT array

Memory protection     Architecturally defined    Segments selectable as no-execute
                                                 Pages selectable as user/supervisor and read-only
                                                 Blocks selectable as user/supervisor and read-only

Page history          Architecturally defined    Referenced and changed bits defined and maintained

Page address          Architecturally defined    Translations stored as PTEs in hashed page tables in memory
translation                                      Page table size determined by mask in SDR1 register

TLBs                  Architecturally defined    Instructions for maintaining optional TLBs (tlbie instruction
                                                 in 603e)

                      603e-specific              64-entry, two-way set associative ITLB
                                                 64-entry, two-way set associative DTLB

Segment descriptors   Architecturally defined    Stored as segment registers on-chip

Page table search     603e-specific              Three MMU exceptions defined: ITLB miss exception, DTLB
support                                          miss on load exception, and DTLB miss on store (or C = 0)
                                                 exception; MMU-related bits set in SRR1 for these exceptions

                                                 IMISS and DMISS registers (missed effective address)
                                                 HASH1 and HASH2 registers (PTEG address)
                                                 ICMP and DCMP registers (for comparing PTEs)
                                                 RPA register (for loading TLBs)

                                                 tlbli rB instruction for loading ITLB entries
                                                 tlbld rB instruction for loading DTLB entries

                                                 Shadow registers for GPR0-3 (can use r0-r3 in table search
                                                 handler without corruption of r0-r3 in context that was
                                                 previously executing)



5.1.1 Memory Addressing 

A program references memory using the effective (logical) address computed by the 
processor when it executes a load, store, or cache instruction, and when it fetches the next 
instruction. The effective address is translated to a physical address according to the 
procedures described in Chapter 7, “Memory Management,” in The Programming
Environments Manual, augmented with information in this chapter. The memory 
subsystem uses the physical address for the access. 

For a complete discussion of effective address calculation, see Section 2.3.2.3, “Effective
Address Calculation.” 

5.1.2 MMU Organization 

Figure 5-1 shows the conceptual organization of a PowerPC MMU in a 32-bit 
implementation; note that it does not describe the specific hardware used to implement the 
memory management function for a particular processor. Processors may optionally 
implement on-chip TLBs and may optionally support the automatic search of the page 
tables for PTEs. In addition, other hardware features (invisible to the system software) not
depicted in the figure may be implemented.

Figure 5-2 and Figure 5-3 show the conceptual organization of the 603e instruction and
data MMUs, respectively. The instruction addresses shown in Figure 5-2 are generated by 
the processor for sequential instruction fetches and addresses that correspond to a change 
of program flow. Data addresses shown in Figure 5-3 are generated by load and store 
instructions and by cache instructions. 

As shown in the figures, after an address is generated, the higher-order bits of the effective 
address, EA0-EA19 (or a smaller set of address bits, EA0-EAn, in the case of blocks), are
translated into physical address bits PA0-PA19. The lower-order address bits, A20-A31, are
untranslated and therefore identical for both effective and physical addresses. After 
translating the address, the MMUs pass the resulting 32-bit physical address to the memory 
subsystem. 
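
For the page case, the split described above can be illustrated with the following sketch. It assumes bit 0 is the most-significant address bit, so that EA0-EA19 are the upper 20 bits and A20-A31 are the 12-bit byte offset within a 4-Kbyte page. The function name and the idea of passing the translated upper bits as a "physical page number" are illustrative.

```python
PAGE_OFFSET_BITS = 12  # A20-A31: untranslated offset within a 4-Kbyte page


def physical_address(effective_address, physical_page_number):
    """Combine translated bits PA0-PA19 with the untranslated bits A20-A31."""
    offset = effective_address & ((1 << PAGE_OFFSET_BITS) - 1)
    return ((physical_page_number & 0xFFFFF) << PAGE_OFFSET_BITS) | offset
```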

In addition to the higher-order address bits, the MMUs automatically keep an indicator of 
whether each access was generated as an instruction or data access and a supervisor/user 
indicator that reflects the state of the PR bit of the MSR when the effective address was 
generated. In addition, for data accesses, there is an indicator of whether the access is for a 
load or a store operation. This information is then used by the MMUs to appropriately direct 
the address translation and to enforce the protection hierarchy programmed by the 
operating system. Section 4.2, “Exception Processing,” describes the MSR, which controls 
some of the critical functionality of the MMUs. 

The figures show the way in which the A20-A26 address bits index into the on-chip 
instruction and data caches to select a cache set. The remaining physical address bits are 
then compared with the tag fields (comprised of bits PA0-PA19) of the four selected cache
blocks to determine if a cache hit has occurred. In the case of a cache miss, the instruction 
or data access is then forwarded to the bus interface unit which then initiates an external 
memory access. 
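
The cache lookup described above can be sketched as follows. The set index (A20-A26) and tag (PA0-PA19) come from the text. Treating the remaining bits A27-A31 as the byte offset within a 32-byte cache block is an assumption, as are the function and field names.

```python
def cache_lookup_fields(physical_address):
    """Split a 32-bit physical address into tag, set index, and byte offset."""
    return {
        "tag": (physical_address >> 12) & 0xFFFFF,     # PA0-PA19
        "set_index": (physical_address >> 5) & 0x7F,   # A20-A26, one of 128 sets
        "byte_in_block": physical_address & 0x1F,      # A27-A31 (assumed 32-byte block)
    }


def is_cache_hit(physical_address, tags_in_selected_set):
    """Compare the address tag with the tags of the four blocks in the selected set."""
    return cache_lookup_fields(physical_address)["tag"] in tags_in_selected_set
```

Under these assumptions, 128 sets of four 32-byte blocks give 16 Kbytes, which agrees with the 16-Kbyte, four-way caches of the 603e.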
Figure 5-1. MMU Conceptual Block Diagram — 32-Bit Implementations

Figure 5-2. PowerPC 603e Microprocessor IMMU Block Diagram

Figure 5-3. PowerPC 603e Microprocessor DMMU Block Diagram

5.1.3 Address Translation Mechanisms

PowerPC processors support the following four types of address translation:

• Page address translation — translates the page frame address for a 4-Kbyte page size

• Block address translation — translates the block number for blocks that range in size 
from 128 Kbyte to 256 Mbyte 

• Direct-store interface address translation — used to generate direct-store interface 
accesses on the external bus; not implemented in the 603e. 

• Real addressing mode translation — when address translation is disabled, the 
physical address is identical to the effective address. 

Figure 5-4 shows the three implemented address translation mechanisms provided by the 
603e MMUs. The segment descriptors shown in the figure control the page address 
translation mechanism. When an access uses page address translation, the appropriate 
segment descriptor is required. In 32-bit implementations, one of the 16 on-chip segment 
registers (which contain segment descriptors) is selected by the four highest-order effective 
address bits. 
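
Segment register selection can be shown with a short sketch. It assumes bit 0 is the most-significant bit of the 32-bit effective address, so the four highest-order bits are bits 28-31 in conventional least-significant-bit-first shifts. The function name is illustrative.

```python
SEGMENT_SIZE = 256 * 1024 * 1024  # 256-Mbyte segments


def segment_register_index(effective_address):
    """Return which of the 16 segment registers (0-15) an effective address selects."""
    return (effective_address >> 28) & 0xF
```

Because each segment covers 256 Mbytes, the 16 segment registers together span the 4-Gbyte effective address space.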

A control bit in the corresponding segment descriptor then determines if the access is to 
memory (memory-mapped) or to the direct-store interface space (selected when the
direct-store translation control bit (T bit) in the corresponding segment descriptor is set). Note that
the direct-store interface is present only for compatibility with existing I/O devices that 
used this interface. When an access is determined to be to the direct-store interface space, 
the 603e takes a DSI exception as described in Section 4.5.3, “DSI Exception (0x00300)” 
if it is a data access, and takes an ISI exception as described in Section 4.5.4, “ISI Exception 
(0x00400)” if it is an instruction access. 

For memory accesses translated by a segment descriptor, the interim virtual address is 
generated using the information in the segment descriptor. Page address translation 
corresponds to the conversion of this virtual address into the 32-bit physical address used 
by the memory subsystem. In most cases, the physical address for the page resides in an
on-chip TLB and is available for quick access. However, if the page address translation misses
in an on-chip TLB, the MMU causes a search of the page tables in memory (using the 
virtual address information and a hashing function) to locate the required physical address. 
When this occurs, the 603e vectors to exception handlers that search the page tables with 
software. 

Block address translation occurs in parallel with page address translation and is similar to 
page address translation; however, fewer higher-order effective address bits are translated 
into physical address bits (more lower-order address bits (at least 17) are untranslated to 
form the offset into a block). Also, instead of segment descriptors and a TLB, block address 
translations use the on-chip BAT registers as a BAT array. If an effective address matches 
the corresponding field of a BAT register, the information in the BAT register is used to 
generate the physical address; in this case, the results of the page translation (occurring in 
parallel) are ignored (even if the segment corresponds to the direct-store interface space). 
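
The effect of block address translation can be illustrated with a simplified model. This sketch does not reproduce the BAT register encoding. It assumes a block described by an effective base address, a physical base address, and a power-of-two size between 128 Kbytes and 256 Mbytes, and all names are hypothetical.

```python
MIN_BLOCK = 128 * 1024          # 17 untranslated low-order bits
MAX_BLOCK = 256 * 1024 * 1024   # 28 untranslated low-order bits


def bat_translate(ea, block_ea_base, block_pa_base, block_size):
    """Return the physical address if `ea` lies in the block, otherwise None."""
    is_power_of_two = block_size & (block_size - 1) == 0
    if not (MIN_BLOCK <= block_size <= MAX_BLOCK and is_power_of_two):
        raise ValueError("block size must be a power of two from 128 Kbytes to 256 Mbytes")
    offset_mask = block_size - 1                # untranslated offset into the block
    block_mask = 0xFFFFFFFF & ~offset_mask      # higher-order bits that are compared
    if (ea & block_mask) != (block_ea_base & block_mask):
        return None                             # no match: page translation result is used
    return (block_pa_base & block_mask) | (ea & offset_mask)
```

A match returns the block translation and the parallel page translation result would be discarded, as described above.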

Figure 5-4. Address Translation Types



Real addressing mode translation occurs when address translation is disabled; in this case 
the physical address generated is identical to the effective address. Instruction and data 
address translation is enabled with the MSR[IR] and MSR[DR] bits, respectively. Thus 
when the processor generates an access, and the corresponding address translation enable 
bit in MSR (MSR[IR] for instruction accesses and MSR[DR] for data accesses) is cleared, 
the resulting physical address is identical to the effective address and all other translation
mechanisms are ignored.

5.1.4 Memory Protection Facilities

In addition to the translation of effective addresses to physical addresses, the MMUs 
provide access protection of supervisor areas from user access and can designate areas of 
memory as read-only as well as no-execute or guarded. Table 5-2 shows the eight 
protection options supported by the MMUs for pages. 



Table 5-2. Access Protection Options for Pages

                                    User Read         User    Supervisor Read   Supervisor
Option                              I-Fetch   Data    Write   I-Fetch   Data    Write

Supervisor-only                     —         —       —       √         √       √
Supervisor-only-no-execute          —         —       —       —         √       √
Supervisor-write-only               √         √       —       √         √       √
Supervisor-write-only-no-execute    —         √       —       —         √       √
Both user/supervisor                √         √       √       √         √       √
Both user/supervisor-no-execute     —         √       √       —         √       √
Both read-only                      √         √       —       √         √       —
Both read-only-no-execute           —         √       —       —         √       —

√  access permitted
—  protection violation



The operating system programs whether instructions can be fetched from an area of 
memory by appropriately using the no-execute option provided in the segment descriptor. 
Each of the remaining options is enforced based on a combination of information in the 
segment descriptor and the page table entry. Thus, the supervisor-only option allows only 
read and write operations generated while the processor is operating in supervisor mode 
(corresponding to MSR[PR] = 0) to access the page. User accesses that map into a 
supervisor-only page cause an exception to be taken. 
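
The options in Table 5-2 can be checked with a small lookup, as in the sketch below. The permission values follow the table. The option strings, the access kind names, and the function name are illustrative, and the sketch does not model how the segment descriptor and PTE bits encode each option.

```python
# Columns: user I-fetch, user data read, user write,
#          supervisor I-fetch, supervisor data read, supervisor write
PAGE_PROTECTION = {
    "supervisor-only":                  (0, 0, 0, 1, 1, 1),
    "supervisor-only-no-execute":       (0, 0, 0, 0, 1, 1),
    "supervisor-write-only":            (1, 1, 0, 1, 1, 1),
    "supervisor-write-only-no-execute": (0, 1, 0, 0, 1, 1),
    "both user/supervisor":             (1, 1, 1, 1, 1, 1),
    "both user/supervisor-no-execute":  (0, 1, 1, 0, 1, 1),
    "both read-only":                   (1, 1, 0, 1, 1, 0),
    "both read-only-no-execute":        (0, 1, 0, 0, 1, 0),
}
ACCESS_KINDS = ("ifetch", "read", "write")


def access_permitted(option, supervisor, kind):
    """Return True if the access is permitted, False if it is a protection violation.

    `supervisor` is True when MSR[PR] = 0; `kind` is one of ACCESS_KINDS.
    """
    column = ACCESS_KINDS.index(kind) + (3 if supervisor else 0)
    return bool(PAGE_PROTECTION[option][column])
```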

Finally, there is a facility in the VEA and OEA that allows pages or blocks to be designated 
as guarded, preventing out-of-order accesses that may cause undesired side effects. For
example, areas of the memory map that are used to control I/O devices can be marked as 
guarded so that accesses (for example, instruction prefetches) do not occur unless they are 
explicitly required by the program. 

For more information on memory protection, see “Memory Protection Facilities,” in 
Chapter 7, “Memory Management,” in The Programming Environments Manual.























































5.1.5 Page History Information 

The MMUs of PowerPC processors also define referenced (R) and changed (C) bits in the 
page address translation mechanism that can be used as history information relevant to the 
page. This information can then be used by the operating system to determine which areas 
of memory to write back to disk when new pages must be allocated in main memory. While 
these bits are initially programmed by the operating system into the page table, the 
architecture specifies that the R and C bits may be maintained either by the processor 
hardware (automatically) or by some software-assist mechanism that updates these bits 
when required. The software table search routines used by the 603e set the R bit when a 
PTE is accessed; the 603e causes an exception (to vector to the software table search 
routines) when the C bit in the corresponding TLB entry requires updating. 
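
The division of work described above can be modeled roughly as follows. This is a sketch under stated assumptions: the PTE and TLB entry are represented as plain dictionaries with 'R' and 'C' keys, and the choice to set C in the PTE during the table search for a store is an assumption about how a handler might behave, not a statement of the 603e routines.

```python
def store_requires_exception(tlb_entry):
    """A store that hits a TLB entry whose C bit is 0 requires a C-bit update."""
    return tlb_entry["C"] == 0


def table_search_update(pte, is_store):
    """Return a copy of the PTE with history bits updated by the search software."""
    updated = dict(pte)
    updated["R"] = 1            # R is set whenever the PTE is accessed
    if is_store:
        updated["C"] = 1        # assumed: C is set when the access is a store
    return updated
```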

5.1.6 General Flow of MMU Address Translation 

The following sections describe the general flow used by PowerPC processors to translate 
effective addresses to virtual and then physical addresses. 

5.1.6.1 Real Addressing Mode and Block Address Translation Selection

When an instruction or data access is generated and the corresponding instruction or data 
translation is disabled (MSR[IR] = 0 or MSR[DR] = 0), real addressing mode translation is 
used (physical address equals effective address) and the access continues to the memory 
subsystem as described in Section 5.2, “Real Addressing Mode.” 

Figure 5-5 shows the flow used by the MMUs in determining whether to select real
addressing mode, block address translation, or to use the segment descriptor to select page
address translation.

Figure 5-5. General Flow of Address Translation (Real Addressing Mode and Block)



Note that if the BAT array search results in a hit, the access is qualified with the appropriate 
protection bits. If the access violates the protection mechanism, an exception (ISI or DSI 
exception) is generated. 

5.1.6.2 Page Address Translation Selection

If address translation is enabled (real addressing mode not selected) and the effective 
address information does not match with a BAT array entry, then the segment descriptor 
must be located. Once the segment descriptor is located, the T bit in the segment descriptor 
selects whether the translation is to a page or to a direct-store interface segment as shown 
in Figure 5-6. Note that the 603e does not implement the direct-store interface, and 
accesses to these segments cause a DSI exception. In addition, Figure 5-6 shows the
way in which the no-execute protection is enforced; if the N bit in the segment descriptor
is set and the access is an instruction fetch, the access is faulted as described in Chapter 7,
“Memory Management,” in The Programming Environments Manual. Note that the figure
shows the flow for these cases as described by the PowerPC OEA, and so the TLB 
references are shown as optional. As the 603e implements TLBs, these branches are valid, 
and described in more detail throughout this chapter. 
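
The selection flow described in these two sections can be summarized by the following sketch. The ordering (real addressing mode, then BAT hit, then the segment descriptor T and N bits) follows the text. The function name, parameter names, and returned strings are illustrative, and protection checks on the selected translation are not modeled.

```python
def select_translation(translation_enabled, bat_hit, seg_t, seg_n, is_fetch):
    """Return the translation path, or the exception taken, for one access.

    translation_enabled: MSR[IR] for instruction fetches, MSR[DR] for data accesses
    bat_hit:             the effective address matches a BAT array entry
    seg_t, seg_n:        T and N bits of the selected segment descriptor
    """
    if not translation_enabled:
        return "real addressing mode"
    if bat_hit:
        return "block address translation"
    if seg_t:
        # Direct-store segments are not supported by the 603e.
        return "ISI exception" if is_fetch else "DSI exception"
    if seg_n and is_fetch:
        return "ISI exception"  # no-execute protection violation
    return "page address translation"
```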

Figure 5-6. General Flow of Page and Direct-Store Interface Address Translation

(Figure legend: Optional to the PowerPC architecture. Implemented in the 603e.)

If the T bit in the corresponding segment descriptor is 0, page address translation is
selected. The information in the segment descriptor is then used to generate the 52-bit 
virtual address. The virtual address is then used to identify the page address translation 
information (stored as page table entries (PTEs) in a page table in memory). For increased 
performance, the 603e has two TLBs to store recently-used PTEs on-chip. 
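
The formation of the 52-bit virtual address can be sketched as follows. The 12-bit byte offset and 16-bit page index follow from the 4-Kbyte page and 256-Mbyte segment sizes. Treating the remaining 24 bits as a virtual segment ID (VSID) taken from the segment descriptor is an assumption consistent with the 52-bit total, and the function name is illustrative.

```python
def virtual_address(vsid, effective_address):
    """Concatenate a 24-bit VSID, the 16-bit page index, and the 12-bit byte offset."""
    page_index = (effective_address >> 12) & 0xFFFF   # page within the 256-Mbyte segment
    byte_offset = effective_address & 0xFFF           # byte within the 4-Kbyte page
    return ((vsid & 0xFFFFFF) << 28) | (page_index << 12) | byte_offset
```

The four highest-order effective address bits do not appear in the result because they are consumed in selecting the segment register.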

If an access hits in the appropriate TLB, the page translation occurs and the physical 
address bits are forwarded to the memory subsystem. If the required PTE is not resident, 
the MMU requires a search of the page table. In this case, the 603e traps to one of three 
exception handlers for the system software to perform the page table search. If the PTE is 
successfully matched, a new TLB entry is created and the page translation is once again 
attempted. This time, the TLB is guaranteed to hit. Once the PTE is located, the access is 
qualified with the appropriate protection bits. If the access is a protection violation (not 
allowed), an exception (instruction access or data access) is generated. 

If the PTE is not found by the table search operation, a page fault condition exists, and the 
TLB miss exception handlers synthesize either an ISI or DSI exception to handle the page 
fault. 



5.1.7 MMU Exceptions Summary 

In order to complete any memory access, the effective address must be translated to a 
physical address. In the 603e, an MMU exception condition occurs if this translation fails 
for one of the following reasons: 

• Page fault — there is no valid entry in the page table for the page specified by the 
effective address (and segment descriptor) and there is no valid BAT translation. 

• An address translation is found but the access is not allowed by the memory 
protection mechanism. 

Additionally, because the 603e relies on software to perform table search operations, the 
processor also takes an exception when: 

• There is a miss in the corresponding (instruction or data) TLB. 

• The page table requires an update to the changed (C) bit. 

The state saved by the processor for each of these exceptions contains information that
identifies the address of the failing instruction. Refer to Chapter 4, “Exceptions,” for a more
detailed description of exception processing.

Because a page fault condition (PTE not found in the page tables in memory) is detected
by the software that performs the table search operation (and not the 603e hardware), it does
not cause a 603e exception in the strictest sense, in that exception processing as described in
Chapter 4, “Exceptions,” does not occur. However, in order to maintain architectural
compatibility with software written for other PowerPC devices, the software that detects
this condition should synthesize an exception by setting the appropriate bits in the DSISR
or SRR1 and branching to the ISI or DSI exception handler. Refer to Section 5.5.2, “Table
Search Operation with the PowerPC 603e Microprocessor,” for more information and
examples of this exception software. The remainder of this chapter assumes that the table
search software emulates this exception and refers to this condition as an exception.
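
The status a table search routine would report for a synthesized page fault can be sketched as follows. The page fault indication is bit 1 of SRR1 for instruction accesses and bit 1 of the DSISR for data accesses, and the vectors are the ISI (0x00400) and DSI (0x00300) offsets. The sketch assumes PowerPC bit numbering (bit 0 is the most-significant bit), and the function name and return format are hypothetical.

```python
def ppc_mask(bit, width=32):
    """Return the mask for one bit in PowerPC numbering (bit 0 is most significant)."""
    return 1 << (width - 1 - bit)


def page_fault_status(is_instruction_access):
    """Return (exception, vector offset, status register, bit mask to set)."""
    if is_instruction_access:
        return ("ISI", 0x00400, "SRR1", ppc_mask(1))
    return ("DSI", 0x00300, "DSISR", ppc_mask(1))
```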

The translation exception conditions defined by the OEA for 32-bit implementations cause 
either the ISI or the DSI exception to be taken as shown in Table 5-3. 



Table 5-3. Translation Exception Conditions

| Condition | Description | Exception |
|---|---|---|
| Page fault (no PTE found) | No matching PTE found in page tables (and no matching BAT array entry) | I access: ISI exception*, SRR1[1] = 1. D access: DSI exception*, DSISR[1] = 1 |
| Block protection violation | Conditions described for block in “Block Memory Protection” in Chapter 7, “Memory Management,” in The Programming Environments Manual. | I access: ISI exception, SRR1[4] = 1. D access: DSI exception, DSISR[4] = 1 |
| Page protection violation | Conditions described for page in “Page Memory Protection” in Chapter 7, “Memory Management,” in The Programming Environments Manual. | I access: ISI exception**, SRR1[4] = 1. D access: DSI exception**, DSISR[4] = 1 |
| No-execute protection violation | Attempt to fetch instruction when SR[N] = 1 | ISI exception, SRR1[3] = 1 |
| Instruction fetch from direct-store segment | Attempt to fetch instruction when SR[T] = 1 | ISI exception, SRR1[3] = 1 |
| Data access to direct-store segment (including floating-point accesses). Note: this is a 603e-specific condition | Attempt to perform load or store (including FP load or store) when SR[T] = 1 | DSI exception, DSISR[5] = 1 |
| Instruction fetch from guarded memory with MSR[IR] = 1 | Attempt to fetch instruction when MSR[IR] = 1 and either matching xBAT[G] = 1, or no matching BAT entry and PTE[G] = 1 | ISI exception, SRR1[3] = 1 |

* The 603e hardware does not vector to these exceptions automatically. It is assumed that the software
that performs the table search operations vectors to these exceptions and sets the appropriate bits when
a page fault condition occurs.

** The table search software can also vector to these exception conditions.

In addition to the translation exceptions, there are other MMU-related conditions (some of 
them defined as implementation-specific and therefore, not required by the architecture) 
that can cause an exception to occur in the 603e. These exception conditions map to the 
processor exception as shown in Table 5-4. For example, the 603e also defines three 
exception conditions to support software table searching. The only exception conditions 
that occur when MSR[DR] = 0 are the conditions that cause the alignment exception for 
data accesses. For more detailed information about the conditions that cause the alignment 
exception (in particular for string/multiple instructions), see Section 4.5.6, “Alignment 
Exception (0x00600).” 

Note that some exception conditions depend upon whether the memory area is set up as 
write-through (W = 1) or cache-inhibited (I = 1). These bits are described fully in 
“Memory/Cache Access Attributes,” in Chapter 5, “Cache Model and Memory 
Coherency,” of The Programming Environments Manual. Refer to Chapter 4, 
“Exceptions,” and to Chapter 6, “Exceptions,” in The Programming Environments Manual 
for a complete description of the SRR1 and DSISR bit settings for these exceptions. 
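
The bit positions quoted in Table 5-3 and Table 5-4 (for example, SRR1[1] or DSISR[4]) use PowerPC bit numbering, in which bit 0 is the most-significant bit of the 32-bit register. The following Python sketch shows how such a bit number converts to a mask. The helper name ppc_bit and the constant names are hypothetical and chosen only for illustration; real table search software would build these values in assembly language.

```python
def ppc_bit(n: int, width: int = 32) -> int:
    """Return the mask for bit n, counting from the most-significant bit (bit 0)."""
    if not 0 <= n < width:
        raise ValueError("bit number out of range")
    return 1 << (width - 1 - n)


# Hypothetical names for two of the settings listed in Table 5-3.
SRR1_PAGE_FAULT = ppc_bit(1)    # SRR1[1] = 1: page fault on an instruction access
DSISR_PROTECTION = ppc_bit(4)   # DSISR[4] = 1: protection violation on a data access

print(hex(SRR1_PAGE_FAULT), hex(DSISR_PROTECTION))
```

Keeping the conversion in one helper avoids the common mistake of treating bit 1 as the value 2, which would be correct only under little-endian bit numbering.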



Table 5-4. Other MMU Exception Conditions for the PowerPC 603e Processor

| Condition | Description | Exception |
|---|---|---|
| TLB miss for an instruction fetch | No matching entry found in ITLB | Instruction TLB miss exception, SRR1[13] = 1, MSR[14] = 1 |
| TLB miss for a data access | No matching entry found in DTLB for data access | Load: data TLB miss on load exception, MSR[14] = 1. Store: data TLB miss on store exception, SRR1[15] = 1, MSR[14] = 1 |
| Store operation and C = 0 | Matching DTLB entry has C = 0 and access is a store | Data TLB miss on store exception, SRR1[15] = 1, MSR[14] = 1 |
| dcbz with W = 1 or I = 1 | dcbz instruction to write-through or cache-inhibited segment or block | Alignment exception (not required by architecture for this condition) |
| dcbz when the data cache is locked | The dcbz instruction takes an alignment exception if the data cache is locked (HID0 bits 18 and 19) when it is executed. | Alignment exception |
| lwarx or stwcx. with W = 1 | Reservation instruction to write-through segment or block | DSI exception, DSISR[5] = 1 |
| lwarx, stwcx., eciwx, or ecowx instruction to direct-store segment | Reservation instruction or external control instruction when SR[T] = 1 | DSI exception, DSISR[5] = 1 |
| Floating-point load or store to direct-store segment | FP memory access when SR[T] = 1 | See data access to direct-store segment in Table 5-3. |
| Load or store that results in a direct-store error | Does not occur in 603e | Does not apply |
| eciwx or ecowx attempted when external control facility disabled | eciwx or ecowx attempted with EAR[E] = 0 | DSI exception, DSISR[11] = 1 |
| lmw, stmw, lswi, lswx, stswi, or stswx instruction attempted in little-endian mode | lmw, stmw, lswi, lswx, stswi, or stswx instruction attempted while MSR[LE] = 1 | Alignment exception |
| Operand misalignment | Translation enabled and operand is misaligned as described in Chapter 4, “Exceptions.” | Alignment exception (some of these cases are implementation-specific) |


5.1.8 MMU Instructions and Register Summary 

The MMU instructions and registers provide the operating system with the ability to set up 
the block address translation areas and the page tables in memory. 

Note that because the implementation of TLBs is optional, the instructions that refer to 
these structures are also optional. However, as these structures serve as caches of the page 
table, the architecture specifies a software protocol for maintaining coherency between 
these caches and the tables in memory whenever changes are made to the tables in memory. 
When the tables in memory are changed, the operating system purges these caches of the 
corresponding entries, allowing the translation caching mechanism to refetch from the 
tables when the corresponding entries are required. 

Note that the 603e implements all TLB-related instructions except tlbia, which is treated 
as an illegal instruction. The 603e also uses some implementation-specific instructions to 
load two on-chip TLBs. 

Because the MMU specification for PowerPC processors is so flexible, it is recommended 
that the software that uses these instructions and registers be “encapsulated” into 
subroutines to minimize the impact of migrating across the family of implementations. 

Table 5-5 summarizes 603e instructions that specifically control the MMU. For more 
detailed information about the instructions, refer to Chapter 2, “PowerPC 603e 
Microprocessor Programming Model,” in this book and Chapter 8, “Instruction Set,” in The 
Programming Environments Manual. 



Table 5-5. PowerPC 603e Microprocessor Instruction Summary — Control MMUs

| Instruction | Description |
|---|---|
| mtsr SR,rS | Move to Segment Register. SR[SR#] ← rS |
| mtsrin rS,rB | Move to Segment Register Indirect. SR[rB[0-3]] ← rS |
| mfsr rD,SR | Move from Segment Register. rD ← SR[SR#] |
| mfsrin rD,rB | Move from Segment Register Indirect. rD ← SR[rB[0-3]] |
| tlbie rB* | TLB Invalidate Entry. For effective address specified by rB, TLB[V] ← 0. The tlbie instruction invalidates both TLB entries indexed by the EA, and operates on both the instruction and data TLBs simultaneously, invalidating four TLB entries. The index corresponds to bits 15-19 of the EA. |
| tlbsync* | TLB Synchronize. Synchronizes the execution of all other tlbie instructions in the system. In the 603e, when the TLBISYNC signal is negated, instruction execution may continue or resume after the completion of a tlbsync instruction. When the TLBISYNC signal is asserted, instruction execution stops after the completion of a tlbsync instruction. |
| tlbli (603e-specific) | Load Instruction TLB Entry. Loads the contents of the ICMP and RPA registers into the ITLB. |
| tlbld (603e-specific) | Load Data TLB Entry. Loads the contents of the DCMP and RPA registers into the DTLB. |

* These instructions are defined by the PowerPC architecture, but are optional.



Table 5-6 summarizes the registers that the operating system uses to program the 603e 
MMUs. These registers are accessible to supervisor-level software only. These registers are 
described in Chapter 2, “Register Set,” in The Programming Environments Manual. For 
603e-specific registers, see Chapter 2, “PowerPC 603e Microprocessor Programming 
Model,” of this book. 



Table 5-6. PowerPC 603e Microprocessor MMU Registers

| Register | Description |
|---|---|
| Segment registers (SR0-SR15) | The sixteen 32-bit segment registers are present only in 32-bit implementations of the PowerPC architecture. The fields in the segment register are interpreted differently depending on the value of bit 0. The segment registers are accessed by the mtsr, mtsrin, mfsr, and mfsrin instructions. |
| BAT registers (IBAT0U-IBAT3U, IBAT0L-IBAT3L, DBAT0U-DBAT3U, and DBAT0L-DBAT3L) | There are 16 BAT registers, organized as four pairs of instruction BAT registers (IBAT0U-IBAT3U paired with IBAT0L-IBAT3L) and four pairs of data BAT registers (DBAT0U-DBAT3U paired with DBAT0L-DBAT3L). The BAT registers are defined as 32-bit registers in 32-bit implementations. These are special-purpose registers that are accessed by the mtspr and mfspr instructions. |
| SDR1 | The SDR1 register specifies the variable used in accessing the page tables in memory. SDR1 is defined as a 32-bit register for 32-bit implementations. This is a special-purpose register that is accessed by the mtspr and mfspr instructions. |
| Instruction TLB miss address and data TLB miss address registers (IMISS and DMISS) | When a TLB miss exception occurs, the IMISS or DMISS register contains the 32-bit effective address of the instruction or data access, respectively, that caused the miss. Note that the 603e always loads a big-endian address into the DMISS register. These registers are 603e-specific. |
| Primary and secondary hash address registers (HASH1 and HASH2) | The HASH1 and HASH2 registers contain the primary and secondary PTEG addresses that correspond to the address causing a TLB miss. These PTEG addresses are automatically derived by the 603e by performing the primary and secondary hashing function on the contents of IMISS or DMISS, for an ITLB or DTLB miss exception, respectively. These registers are 603e-specific. |
| Instruction and data PTE compare registers (ICMP and DCMP) | The ICMP and DCMP registers contain the word to be compared with the first word of a PTE in the table search software routine to determine if a PTE contains the address translation for the instruction or data access. The contents of ICMP and DCMP are automatically derived by the 603e when a TLB miss exception occurs. These registers are 603e-specific. |
| Required physical address register (RPA) | The system software loads a TLB entry by loading the second word of the matching PTE entry into the RPA register and then executing the tlbli or tlbld instruction (for loading the ITLB or DTLB, respectively). This register is 603e-specific. |



Note that the 603e contains other features that don’t specifically control the 603e MMU but 
are implemented to increase performance and flexibility. These are: 

• Complete set of shadow segment registers for the instruction MMU. These registers 
are invisible to the programming model, as described in Section 5.4.3, “TLB 
Description.” 

• Temporary GPR0-GPR3. These registers are available as r0-r3 when MSR[TGPR] 
is set. The 603e automatically sets MSR[TGPR] whenever one of the three TLB 
miss exceptions occurs, allowing these exception handlers to have four registers that 
are used as scratchpad space, without having to save or restore this part of the 
machine state that existed when the exception occurred. Note that MSR[TGPR] is 
restored to the value in SRR1 when the rfi instruction is executed. Refer to 
Section 5.5.2, “Table Search Operation with the PowerPC 603e Microprocessor,” 
for code examples that take advantage of these registers. 

In addition, the 603e automatically saves the values of CR[CR0] of the executing 
context to SRR1[0-3] whenever one of the three TLB miss exceptions occurs. Thus, the 
exception handler can set CR[CR0] bits and branch accordingly in the exception handler 
routine, without having to save the existing CR[CR0] bits. However, the exception handler 
must restore these bits to CR[CR0] before executing the rfi instruction. There are also four 
other bits saved in SRR1 whenever a TLB miss exception occurs that give information 
about whether the access was an instruction or data access and, if it was a data access, 
whether it was for a load or a store instruction. These bits also give some information 
related to the protection attributes for the access, and which set in the TLB will be replaced 
when the next TLB entry is loaded. Refer to Section 5.5.2.1, “Resources for Table Search 
Operations,” for more information on these bits and their use. 

5.2 Real Addressing Mode 

If address translation is disabled (MSR[IR] = 0 or MSR[DR] = 0) for a particular access, 
the effective address is treated as the physical address and is passed directly to the memory 
subsystem as described in Chapter 7, “Memory Management,” in The Programming 
Environments Manual. 

Note that the default WIMG bits (0b0011) cause data accesses to be considered cacheable 
(I = 0) and thus load and store accesses are weakly ordered. If I/O devices require load and 
store accesses to occur in strict program order (strongly ordered), translation must be 
enabled so that the corresponding I bit can be set. Also, for instruction accesses, the default 
memory access mode bits (WIMG) are 0b0001. That is, the instruction cache is enabled 
(I = 0), and the memory is guarded. The W and M bits have no effect on the instruction cache. 
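
As an illustration of how the two default values quoted above break down into individual attributes, the following Python sketch decodes a 4-bit WIMG value, taking W as the most-significant bit and G as the least-significant bit. The class and function names are hypothetical and are not part of any 603e software interface.

```python
from typing import NamedTuple


class WIMG(NamedTuple):
    w: int  # write-through
    i: int  # cache-inhibited
    m: int  # memory coherence
    g: int  # guarded


def decode_wimg(bits: int) -> WIMG:
    """Split a 4-bit WIMG value (W is the most-significant bit) into its fields."""
    if not 0 <= bits <= 0b1111:
        raise ValueError("WIMG is a 4-bit field")
    return WIMG((bits >> 3) & 1, (bits >> 2) & 1, (bits >> 1) & 1, bits & 1)


data_default = decode_wimg(0b0011)         # real-mode data accesses
instruction_default = decode_wimg(0b0001)  # real-mode instruction accesses
print(data_default, instruction_default)
```

Both defaults have I = 0, which matches the statement that real-mode accesses are treated as cacheable.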

For information on the synchronization requirements for changes to MSR[IR] and 
MSR[DR], refer to “Synchronization Requirements for Special Registers and for 
Lookaside Buffers” in Chapter 2, “PowerPC Register Set,” in The Programming 
Environments Manual. 

5.3 Block Address Translation 

The block address translation (BAT) mechanism in the OEA provides a way to map ranges 
of effective addresses larger than a single page into contiguous areas of physical memory. 
Such areas can be used for data that is not subject to normal virtual memory handling 
(paging), such as a memory-mapped display buffer or an extremely large array of numerical 
data. 

The software model for block address translation in the 603e is described in Chapter 7, 
“Memory Management,” in The Programming Environments Manual for 32-bit 
implementations. 

Implementation Note — The 603e BAT registers are not initialized by the hardware after 
the power-up or reset sequence. Consequently, all valid bits in both instruction and data 
BAT areas must be cleared before setting any BAT area for the first time. This is true 
regardless of whether address translation is enabled. Also, software must avoid overlapping 
blocks while updating a BAT area or areas. Even if translation is disabled, multiple BAT 
area hits are treated as programming errors and can corrupt the BAT registers and produce 
unpredictable results. 

5.4 Memory Segment Model 

The 603e adheres to the memory segment model as defined in Chapter 7, “Memory 
Management,” in The Programming Environments Manual for 32-bit implementations. 
Memory in the PowerPC OEA is divided into 256-Mbyte segments. This segmented 
memory model provides a way to map 4-Kbyte pages of effective addresses to 4-Kbyte 
pages in physical memory (page address translation), while providing the programming 
flexibility afforded by a large virtual address space (52 bits). 

The segment/page address translation mechanism may be superseded by the block address 
translation (BAT) mechanism described in Section 5.3, “Block Address Translation.” If 
not, the translation proceeds in the following two steps: 

1. From effective address to the virtual address (which never exists as a specific entity 
but can be considered to be the concatenation of the virtual page number and the 
byte offset within a page), and 

2. From virtual address to physical address. 

This section highlights those areas of the memory segment model defined by the OEA that 
are specific to the 603e. 
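
The first of the two translation steps can be sketched in Python as follows. The sketch assumes the 32-bit field layout implied by the text: EA0-EA3 select the segment register, EA4-EA19 form the 16-bit page index, EA20-EA31 form the byte offset within a 4-Kbyte page, and the 24-bit VSID supplied by the segment register is concatenated with the page index and byte offset to form the 52-bit virtual address. The function names are hypothetical.

```python
from typing import Tuple


def split_effective_address(ea: int) -> Tuple[int, int, int]:
    """Return (segment register number, page index, byte offset) for a 32-bit EA."""
    sr_number = (ea >> 28) & 0xF      # EA0-EA3
    page_index = (ea >> 12) & 0xFFFF  # EA4-EA19
    byte_offset = ea & 0xFFF          # EA20-EA31
    return sr_number, page_index, byte_offset


def virtual_address(vsid: int, ea: int) -> int:
    """Concatenate a 24-bit VSID with the page index and byte offset (52 bits)."""
    _, page_index, byte_offset = split_effective_address(ea)
    return ((vsid & 0xFFFFFF) << 28) | (page_index << 12) | byte_offset


print(split_effective_address(0x12345678))
print(hex(virtual_address(0xABCDEF, 0x12345678)))
```

The virtual address is computed here only for illustration; as noted above, it never exists as a specific entity in the processor.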

5.4.1 Page History Recording 

Referenced (R) and changed (C) bits reside in each PTE to keep history information about 
the page. They are maintained by a combination of the 603e hardware and the table search 
software. The operating system uses this information to determine which areas of memory 
to write back to disk when new pages must be allocated in main memory. Referenced and 
changed recording is performed only for accesses made with page address translation and 
not for translations made with the BAT mechanism or for accesses that correspond to 
direct-store interface (T = 1) segments. Furthermore, R and C bits are maintained only for 
accesses made while address translation is enabled (MSR[IR] = 1 or MSR[DR] = 1). 

In the 603e, the referenced and changed bits are updated as follows: 

• For TLB hits, the C bit is updated according to Table 5-7. 

• For TLB misses, when a table search operation is in progress to locate a PTE, the R 
and C bits are updated (set, if required) to reflect the status of the page based on this 
access. 

Table 5-7. Table Search Operations to Update History Bits — TLB Hit Case

| R and C Bits in TLB Entry | Processor Action |
|---|---|
| 00 | Combination doesn’t occur |
| 01 | Combination doesn’t occur |
| 10 | Read: No special action. Write: Table search operation required to update C. Causes a data TLB miss on store exception |
| 11 | No special action for read or write |



The table shows that the status of the C bit in the TLB entry (in the case of a TLB hit) is 
what causes the processor to update the C bit in the PTE (the R bit is assumed to be set in 
the page tables if there is a TLB hit). Therefore, when software clears the R and C bits in 
the page tables in memory, it must invalidate the TLB entries associated with the pages 
whose referenced and changed bits were cleared. 
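
The TLB hit case can be summarized in a few lines of Python. This is only a sketch of the decision described above, assuming (as the text does) that R is always set in a resident TLB entry, so that only the C bit and the access type matter. The function name and the returned strings are hypothetical.

```python
def tlb_hit_action(c_bit: int, is_store: bool) -> str:
    """Return the processor action for a TLB hit, given the entry's C bit."""
    if is_store and c_bit == 0:
        # The PTE in memory must have its C bit set by the table search software.
        return "data TLB miss on store exception"
    return "no special action"


for c_bit in (0, 1):
    for is_store in (False, True):
        print(c_bit, is_store, tlb_hit_action(c_bit, is_store))
```

The sketch also shows why stale TLB entries are a problem: if software clears C in the page table but leaves a TLB entry with C = 1, later stores hit with "no special action" and the page table is never updated.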

The 603e causes the R bit to be set for the execution of the dcbt or dcbtst instruction to that 
page (by causing a TLB miss exception to load the TLB entry in the case of a TLB miss). 
However, neither of these instructions causes the C bit to be set. 

The update of the referenced and changed bits is performed by PowerPC processors as if 
address translation were disabled (real addressing mode translation). Additionally, these 
updates should be performed with single-beat read and byte write transactions on the bus. 

5.4.1.1 Referenced Bit 

The referenced (R) bit of a page is located in the PTE in the page table. Every time a page 
is referenced (with a read or write access) and the R bit is zero, the R bit is then set in the 
page table. The OEA specifies that the referenced bit may be set immediately, or the setting 
may be delayed until the memory access is determined to be successful. Because the 
reference to a page is what causes a PTE to be loaded into the TLB, the referenced bit in all 
603e TLB entries is effectively always set. The processor never automatically clears the 
referenced bit. 

The referenced bit is only a hint to the operating system about the activity of a page. At 
times, the referenced bit may be set although the access was not logically required by the 
program or even if the access was prevented by memory protection. Examples of this in 
PowerPC systems include the following: 

• Fetching of instructions not subsequently executed 

• Accesses generated by an lswx or stswx instruction with a zero length 

• Accesses generated by a stwcx. instruction when no store is performed because a 
reservation does not exist 

• Accesses that cause exceptions and are not completed 

5.4.1.2 Changed Bit 

The changed bit of a page is located both in the PTE in the page table and in the copy of 
the PTE loaded into the TLB (if a TLB is implemented, as in the 603e). Whenever a data 
store instruction is executed successfully, if the TLB search (for page address translation) 
results in a hit, the changed bit in the matching TLB entry is checked. If it is already set, 
the processor does not change the C bit. If the TLB changed bit is 0, it is set and a table 
search operation is performed to also set the C bit in the corresponding PTE in the page 
table. The 603e causes a data TLB miss on store exception for this case so that the software 
can perform the table search operation for setting the C bit. Refer to Section 5.5.2, “Table 
Search Operation with the PowerPC 603e Microprocessor,” for an example code sequence 
that handles these conditions. 

The changed bit (in both the TLB and the PTE in the page tables) is set only when a store 
operation is allowed by the page memory protection mechanism and all conditional 
branches occurring earlier in the program have been resolved (such that the store is 
guaranteed to be in the execution path). Furthermore, the following conditions may cause 
the C bit to be set: 

• The execution of an stwcx. instruction is allowed by the memory protection 
mechanism but a store operation is not performed because no reservation exists. 

• The execution of an stswx instruction is allowed by the memory protection 
mechanism but a store operation is not performed because the specified length is 
zero. 

• The store operation is not performed because an exception occurs before the store is 
performed. 

Again, note that although the execution of the dcbt and dcbtst instructions may cause the 
R bit to be set, they never cause the C bit to be set. 

5.4.1.3 Scenarios for Referenced and Changed Bit Recording 

This section provides a summary of the model (defined by the OEA) that is used by 
PowerPC processors for maintaining the referenced and changed bits. In some scenarios, 
the bits are guaranteed to be set by the processor; in some scenarios, the architecture allows 
that the bits may be set (not absolutely required); and in some scenarios, the bits are 
guaranteed to not be set. 

In implementations that do not maintain the R and C bits in hardware (such as the 603e), 
software assistance is required. For these processors, the information in this section still 
applies, except that the software performing the updates is constrained to the rules 
described (that is, it must set bits shown as guaranteed to be set and must not set bits shown as 
guaranteed to not be set). 

Table 5-8 defines a prioritized list of the R and C bit settings for all scenarios. The entries 
in the table are prioritized from top to bottom, such that a matching scenario occurring 
closer to the top of the table takes precedence over a matching scenario closer to the bottom 
of the table. For example, if an stwcx. instruction causes a protection violation and there is 
no reservation, the C bit is not altered, as shown for the protection violation case. Note that 
in the table, load operations include those generated by load instructions, by the eciwx 
instruction, and by the cache management instructions that are treated as a load with respect 
to address translation. Similarly, store operations include those operations generated by 
store instructions, by the ecowx instruction, and by the cache management instructions that 
are treated as a store with respect to address translation. In the columns for the 603e, the 
combination of the 603e itself and the software used to search the page tables (described in 
Section 5.5.2, “Table Search Operation with the PowerPC 603e Microprocessor”) is 
assumed. 



Table 5-8. Model for Guaranteed R and C Bit Settings

| Priority | Scenario | R Bit Set (OEA) | R Bit Set (603e) | C Bit Set (OEA) | C Bit Set (603e) |
|---|---|---|---|---|---|
| 1 | No-execute protection violation | maybe | no | no | no |
| 2 | Page protection violation | maybe | yes | no | no |
| 3 | Out-of-order instruction fetch or load operation | maybe | no | no | no |
| 4 | Out-of-order store operation contingent on a branch, trap, sc or rfi instruction, or a possible exception | maybe | no | no | no |
| 5 | Out-of-order store operation contingent on an exception, other than a trap or sc instruction, not occurring | maybe (1) | no | maybe (1) | no |
| 6 | Zero-length load (lswx) | maybe | yes | no | no |
| 7 | Zero-length store (stswx) | maybe (1) | yes | maybe (1) | yes |
| 8 | Store conditional (stwcx.) with no reservation | maybe (1) | yes | maybe (1) | yes |
| 9 | In-order instruction fetch | yes (2) | yes | no | no |
| 10 | Load instruction or eciwx | yes | yes | no | no |
| 11 | Store instruction, ecowx, or dcbz instruction | yes | yes | yes | yes |
| 12 | dcbt, dcbtst, dcbst, or dcbf instruction | maybe | yes | no | no |
| 13 | icbi | maybe | no | no | no |
| 14 | dcbi instruction | maybe (1) | yes | maybe (1) | yes |

(1) If C is set, R is guaranteed to also be set.

(2) This includes the case in which the instruction was fetched out-of-order and R was not set
(does not apply for 603e).

For more information, see “Page History Recording” in Chapter 7, “Memory 
Management,” of The Programming Environments Manual. 
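
The top-to-bottom precedence rule of Table 5-8 can be expressed as a first-match search over an ordered list. The Python sketch below encodes only the two 603e columns of the table; the scenario keys and the function name are hypothetical labels invented for this illustration.

```python
from typing import Iterable, List, Tuple

# (priority, scenario key, R bit set on 603e, C bit set on 603e)
RC_MODEL_603E: List[Tuple[int, str, bool, bool]] = [
    (1, "no_execute_violation", False, False),
    (2, "page_protection_violation", True, False),
    (3, "out_of_order_fetch_or_load", False, False),
    (4, "out_of_order_store_branch_contingent", False, False),
    (5, "out_of_order_store_exception_contingent", False, False),
    (6, "zero_length_load", True, False),
    (7, "zero_length_store", True, True),
    (8, "stwcx_no_reservation", True, True),
    (9, "in_order_instruction_fetch", True, False),
    (10, "load_or_eciwx", True, False),
    (11, "store_ecowx_or_dcbz", True, True),
    (12, "dcbt_dcbtst_dcbst_dcbf", True, False),
    (13, "icbi", False, False),
    (14, "dcbi", True, True),
]


def resolve_rc(scenarios: Iterable[str]) -> Tuple[int, str, bool, bool]:
    """Return the highest-priority table row among the scenarios that apply."""
    active = set(scenarios)
    for row in RC_MODEL_603E:  # the list is ordered by priority
        if row[1] in active:
            return row
    raise ValueError("no matching scenario")


# The example from the text: stwcx. with a protection violation and no reservation.
print(resolve_rc({"stwcx_no_reservation", "page_protection_violation"}))
```

In the printed result the protection violation row wins, so the C bit is not altered even though the stwcx. row alone would allow it to be set.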

5.4.2 Page Memory Protection 

The 603e implements page memory protection as it is defined in Chapter 7, “Memory 
Management,” in The Programming Environments Manual. 

5.4.3 TLB Description 

This section describes the hardware resources provided in the 603e to facilitate the page 
address translation process. Note that the hardware implementation of the MMU is not 
specified by the architecture, and while this description applies to the 603e, it does not 
necessarily apply to other PowerPC processors. 

5.4.3.1 TLB Organization 

Because the 603e has two MMUs (IMMU and DMMU) that operate in parallel, some of 
the MMU resources are shared, and some are actually duplicated (shadowed) in each MMU 
to maximize performance. Figure 5-7 shows the relationships between these resources 
within both the IMMU and DMMU and how the various portions of the effective address 
are used in the address translation process. 




Figure 5-7. Segment Register and TLB Organization 

While both MMUs can be accessed simultaneously (both sets of segment registers and 
TLBs can be accessed in the same clock), when there is an exception condition, only one 
exception is reported at a time. ITLB miss exceptions are reported when there are no more 
instructions to be dispatched or retired (the pipeline is empty), and DTLB miss conditions 
are reported when the load or store instruction is ready to be retired. Refer to Chapter 6, 
“Instruction Timing,” for more detailed information about the internal pipelines and the 
reporting of exceptions. 

As TLB entries are on-chip copies of PTEs in the page tables in memory, they are similar 
in structure. TLB entries consist of two words; the upper-order word contains the VSID and 
API fields of the upper-order word of the PTE and the lower-order word contains the RPN, 
the C bit, the WIMG bits and the PP bits (as in the lower-order word of the PTE). In order 
to uniquely identify a TLB entry as the required PTE, the TLB entry also contains five more bits 
of the page index, EA10-EA14 (in addition to the API bits of the PTE). 

When an instruction or data access occurs, the effective address is routed to the appropriate 
MMU. EA0-EA3 select one of the 16 segment registers, and the remaining effective 
address bits and the virtual address from the segment register are passed to the TLB. 
EA15-EA19 then select two entries in the TLB; the valid bit is checked and EA10-EA14, 
the VSID, and API fields for the access are then compared with the corresponding values 
in the TLB entries. If one of the entries hits, the PP bits are checked for a protection 
violation, and the C bit is checked. If these bits don’t cause an exception, the RPN value is 
passed to the memory subsystem and the WIMG bits are then used as attributes for the 
access. 
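
The lookup just described can be modeled in Python as a two-way set-associative array of 32 sets. This is a simplified sketch, not a description of the hardware: the class and method names are hypothetical, the API bits (EA4-EA9) and EA10-EA14 are combined into a single 11-bit tag, the PP and C checks are omitted, and the physical address is assumed to be the 20-bit RPN concatenated with the 12-bit byte offset.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TlbEntry:
    valid: bool = False
    vsid: int = 0
    tag: int = 0   # EA4-EA14: the API bits plus EA10-EA14
    rpn: int = 0


class Tlb:
    SETS = 32
    WAYS = 2

    def __init__(self) -> None:
        self.sets: List[List[TlbEntry]] = [
            [TlbEntry() for _ in range(self.WAYS)] for _ in range(self.SETS)
        ]

    @staticmethod
    def index_and_tag(ea: int) -> Tuple[int, int]:
        index = (ea >> 12) & 0x1F   # EA15-EA19 select the set
        tag = (ea >> 17) & 0x7FF    # EA4-EA14 are compared
        return index, tag

    def load(self, way: int, vsid: int, ea: int, rpn: int) -> None:
        """Install a translation for the page containing ea in the given way."""
        index, tag = self.index_and_tag(ea)
        self.sets[index][way] = TlbEntry(True, vsid, tag, rpn)

    def lookup(self, vsid: int, ea: int) -> Optional[TlbEntry]:
        index, tag = self.index_and_tag(ea)
        for entry in self.sets[index]:
            if entry.valid and entry.vsid == vsid and entry.tag == tag:
                return entry
        return None  # TLB miss


def translate(tlb: Tlb, vsid: int, ea: int) -> Optional[int]:
    """Return the physical address on a hit, or None on a TLB miss."""
    entry = tlb.lookup(vsid, ea)
    if entry is None:
        return None
    return (entry.rpn << 12) | (ea & 0xFFF)


tlb = Tlb()
tlb.load(0, 0xABCDEF, 0x12345678, 0x00042)
print(hex(translate(tlb, 0xABCDEF, 0x12345678)))
```

Because only EA15-EA19 select the set, two pages whose effective addresses share those five bits compete for the same two entries, which is why the tag must include EA10-EA14 in addition to the API bits.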

Although address translation is disabled on a reset condition, the valid bits of the BAT array 
and TLB entries are not automatically cleared. Thus TLB entries must be explicitly cleared 
by the system software (with the tlbie instruction) before the valid entries are loaded and 
address translation is enabled. 

5.4.3.2 TLB Entry Invalidation 

For the PowerPC processors, such as the 603e, that implement TLB structures to maintain 
on-chip copies of the PTEs that are resident in physical memory, the optional tlbie 
instruction provides a way to invalidate the TLB entries. Note that the execution of the tlbie 
instruction in the 603e invalidates four entries: both the ITLB entries indexed by 
EA15-EA19 and both the indexed entries of the DTLB. 
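
Because the index is taken from the five bits EA15-EA19, there are 32 distinct index values, and one tlbie per value covers every entry of both TLBs. The Python sketch below only computes the 32 operand values that such a loop would place in rB; the tlbie instruction itself must be issued by supervisor-level assembly code, and the function name is hypothetical.

```python
from typing import List


def tlbie_flush_addresses() -> List[int]:
    """Return one effective address for each of the 32 values of EA15-EA19."""
    # EA19 is bit position 12 counting from the least-significant bit,
    # so stepping the index by one adds 0x1000 to the address.
    return [index << 12 for index in range(32)]


addresses = tlbie_flush_addresses()
print(len(addresses), hex(addresses[0]), hex(addresses[-1]))
```

The remaining address bits are left as zero here because they do not take part in selecting the entries to invalidate.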

The architecture allows tlbie to optionally enable a TLB invalidate signaling mechanism in 
hardware so that other processors also invalidate their resident copies of the matching PTE. 
The 603e does not signal the TLB invalidation to other processors nor does it perform any 
action when a TLB invalidation is performed by another processor. 

The tlbsync instruction causes instruction execution to stop if the TLBISYNC signal is also 
asserted. If TLBISYNC is negated, instruction execution may continue or resume after the 
completion of a tlbsync instruction. Section 8.8.2, “TLBISYNC Input,” describes the TLB 
synchronization mechanism in further detail. 

The tlbia instruction is not implemented on the 603e and when its opcode is encountered, 
an illegal instruction program exception is generated. To invalidate all entries of both TLBs, 
32 tlbie instructions must be executed, incrementing the value in EA15-EA19 by one each 
time. See Chapter 8, “Instruction Set,” in The Programming Environments Manual for 
detailed information about the tlbie instruction. 

5.4.4 Page Address Translation Summary 

Figure 5-8 provides the detailed flow for the page address translation mechanism. The 
figure includes the checking of the N-bit in the segment descriptor and then expands on the 
“TLB Hit” branch of Figure 5-6. The detailed flow for the “TLB Miss” branch of 
Figure 5-6 is described in Section 5.5.1, “Page Table Search Operation — Conceptual 
Flow.” Note that, as in the case of block address translation, if execution of the dcbz 
instruction is attempted either in write-through mode or as cache-inhibited (W = 1 or 
I = 1), the alignment exception is generated. The checking of memory protection violation 
conditions for page address translation is described in Chapter 7, “Memory Management,” 
in The Programming Environments Manual for 32-bit implementations. 




Figure 5-8. Page Address Translation Flow for 32-Bit Implementations — TLB Hit 

5.5 Page Table Search Operation 

As stated earlier, the operating system must synthesize the table search algorithm for setting 
up the tables. In the case of the 603e, the TLB miss exception handlers also use this 
algorithm (with the assistance of some hardware-generated values) to load TLB entries 
when TLB misses occur as described in Section 5.5.2, “Table Search Operation with the 
PowerPC 603e Microprocessor.” 

5.5.1 Page Table Search Operation — Conceptual Flow 

The table search process for a PowerPC processor varies slightly for 64- and 32-bit 
implementations. The main differences are the address ranges and PTE formats specified. 
An outline of the page table search process performed by a 32-bit implementation (such as 
the 603e) is as follows: 

1. The 32-bit physical address of the primary PTEG is generated as described in 
Chapter 7, “Memory Management,” in The Programming Environments Manual for 
32-bit implementations. 

2. The first PTE (PTE0) in the primary PTEG is read from memory. PTE reads should 
occur with an implied WIM memory/cache mode control bit setting of 0b001. 
Therefore, they are considered cacheable and burst in from memory and placed in 
the cache. 

3. The PTE in the selected PTEG is tested for a match with the virtual page number 
(VPN) of the access. The VPN is the VSID concatenated with the page index field 
of the virtual address. For a match to occur, the following must be true: 

— PTE[H] = 0 
— PTE[V] = 1 
— PTE[VSID] = VA[0-23] 

— PTE[API] = VA[24-29] 

4. If a match is not found, step 3 is repeated for each of the other seven PTEs in the 
primary PTEG. If a match is found, the table search process continues as described 
in step 8. If a match is not found within the eight PTEs of the primary PTEG, the 
address of the secondary PTEG is generated. 

5. The first PTE (PTE0) in the secondary PTEG is read from memory. Again, because
PTE reads typically have a WIM bit combination of 0b001, an entire cache line is
burst into the on-chip cache.

6. The PTE in the selected secondary PTEG is tested for a match with the virtual page 
number (VPN) of the access. For a match to occur, the following must be true: 

— PTE[H] = 1 
— PTE[V] = 1 
— PTE[VSID] = VA[0-23] 

— PTE[API] = VA[24-29] 

7. If a match is not found, step 6 is repeated for each of the other seven PTEs in the
secondary PTEG. 

8. If a match is found, the PTE is written into the on-chip TLB (if implemented, as in 
the 603e) and the R bit is updated in the PTE in memory (if necessary). If there is no 
memory protection violation, the C bit is also updated in memory and the table 
search is complete. 

9. If a match is not found within the eight PTEs of the secondary PTEG, the search 
fails, and a page fault exception condition occurs (either an ISI exception or a DSI 
exception). Note that the software routines that implement this algorithm for the
603e must synthesize this condition by appropriately setting the bits in SRR1 (or
DSISR) and branching to the ISI or DSI handler routine.

Reads from memory for table search operations should be performed as global (but not 
exclusive), cacheable operations, and can be loaded into the on-chip cache. 
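
The match conditions of steps 3 and 6 can be sketched as a single predicate. This is a minimal illustration, not part of the architecture definition: it assumes the first-word PTE layout (V, VSID, H, API) that this chapter gives for the DCMP and ICMP registers, and the name pte_matches and its parameters are hypothetical.

```python
# Field masks for the first word of a 32-bit PTE, using the manual's bit
# numbering (bit 0 is the most-significant bit of the word).
PTE_V = 0x80000000        # bit 0: valid
PTE_VSID_SHIFT = 7        # bits 1-24: virtual segment ID
PTE_VSID_MASK = 0xFFFFFF
PTE_H = 0x00000040        # bit 25: hash function identifier
PTE_API_MASK = 0x3F       # bits 26-31: abbreviated page index


def pte_matches(pte_word0, vsid, api, secondary):
    """Apply the match test of step 3 (secondary=False) or step 6 (secondary=True)."""
    valid = bool(pte_word0 & PTE_V)
    h = bool(pte_word0 & PTE_H)
    pte_vsid = (pte_word0 >> PTE_VSID_SHIFT) & PTE_VSID_MASK
    pte_api = pte_word0 & PTE_API_MASK
    # PTE[V] = 1, PTE[H] selects primary or secondary, VSID and API must agree.
    return valid and h == secondary and pte_vsid == vsid and pte_api == api
```

The same predicate serves both PTEGs; only the expected value of the H bit changes between the primary and secondary search.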

Figure 5-9 and Figure 5-10 provide conceptual flow diagrams of primary and secondary
page table search operations, respectively, as described in the OEA for 32-bit processors.
Recall that the architecture allows implementations to perform the page table search
operations automatically (in hardware) or to require software assistance, as is the case
with the 603e. Also, the elements in the figures that apply to TLBs are shown as optional
because TLBs are not required by the architecture.

Figure 5-9 shows the case of a dcbz instruction that is executed with W = 1 or I = 1, and 
that the R bit may be updated in memory (if required) before the operation is performed or 
the alignment exception occurs. The R bit may also be updated in the case of a memory 
protection violation. 

Figure 5-10. Secondary Page Table Search Flow — Conceptual Flow

5.5.2 Table Search Operation with the PowerPC 603e 
Microprocessor 

The 603e has a set of implementation-specific registers, exceptions, and instructions that 
facilitate very efficient software searching of the page tables in memory. This section 
describes those resources and provides three example code sequences that can be used in a 
603e system for an efficient search of the translation tables in software. These three code 
sequences can be used as handlers for the three exceptions requiring access to the PTEs in 
the page tables in memory: instruction TLB miss, data TLB miss on load, and data TLB 
miss on store exceptions. 

5.5.2.1 Resources for Table Search Operations

In addition to setting up the translation page tables in memory, the system software must 
assist the processor in loading PTEs into the on-chip TLBs. When a required TLB entry is 
not found in the appropriate TLB, the processor vectors to one of the three TLB miss 
exception handlers so that the software can perform a table search operation and load the 
TLB. When this occurs, the processor automatically saves information about the access and 
the executing context. Table 5-9 provides a summary of the implementation-specific
exceptions, registers, and instructions that can be used by the TLB miss exception handler
software in 603e systems. Refer to Chapter 4, “Exceptions,” for more information about 
exception processing. 



Table 5-9. Implementation-Specific Resources for Table Search Operations

Resource      Name                             Description
------------  -------------------------------  --------------------------------------------
Exceptions    Instruction TLB miss exception   No matching entry found in ITLB
              (vector offset 0x1000)

              Data TLB miss on load exception  No matching entry found in DTLB for a load
              (vector offset 0x1100)           data access

              Data TLB miss on store           No matching entry found in DTLB for a store
              exception; also caused when      data access, or matching DTLB entry has
              changed bit must be updated      C = 0 and access is a store
              (vector offset 0x1200)

Registers     IMISS and DMISS                  When a TLB miss exception occurs, the IMISS
                                               or DMISS register contains the 32-bit
                                               effective address of the instruction or data
                                               access that caused the miss exception.

              ICMP and DCMP                    The ICMP and DCMP registers contain the word
                                               to be compared with the first word of a PTE
                                               in the table search software routine to
                                               determine if a PTE contains the address
                                               translation for the instruction or data
                                               access. The contents of ICMP and DCMP are
                                               automatically derived by the 603e when a TLB
                                               miss exception occurs.

              HASH1 and HASH2                  The HASH1 and HASH2 registers contain the
                                               primary and secondary PTEG addresses that
                                               correspond to the address causing a TLB
                                               miss. These PTEG addresses are automatically
                                               derived by the 603e by performing the
                                               primary and secondary hashing function on
                                               the contents of IMISS or DMISS, for an ITLB
                                               or DTLB miss exception, respectively.

              RPA                              The system software loads a TLB entry by
                                               loading the second word of the matching PTE
                                               entry into the RPA register and then
                                               executing the tlbli or tlbld instruction
                                               (for loading the ITLB or DTLB,
                                               respectively).

Instructions  tlbli rB                         Loads the contents of the ICMP and RPA
                                               registers into the ITLB entry selected by
                                               <ea> and SRR1[WAY]

              tlbld rB                         Loads the contents of the DCMP and RPA
                                               registers into the DTLB entry selected by
                                               <ea> and SRR1[WAY]


In addition, the 603e contains the following other features that don’t specifically control the 
603e MMU but that are implemented to increase performance and flexibility in the 
software table search routines whenever one of the three TLB miss exceptions occurs: 



• Temporary GPR0-GPR3. These registers are available as r0-r3 when MSR[TGPR]
is set. The 603e automatically sets MSR[TGPR] for these cases, allowing these
exception handlers to have four registers that are used as scratchpad space, without
having to save or restore this part of the machine state that existed when the
exception occurred. Note that MSR[TGPR] is cleared when the rfi instruction is
executed because the old MSR value (with MSR[TGPR] = 0) saved in SRR1 is
restored. Refer to Section 5.5.2.2, “Software Table Search Operation,” for code
examples that take advantage of these registers.

• The 603e also automatically saves the values of CR[CR0] of the executing context
to SRR1[0-3]. Thus, the exception handler can set CR[CR0] bits and branch
accordingly in the exception handler routine, without having to save the existing
CR[CR0] bits. However, the exception handler must restore these bits to CR[CR0]
before executing the rfi instruction.

• Also saved in SRR1 are two bits identifying the type of miss (SRR1[D/I] identifies
instruction or data, and SRR1[S/L] identifies a load or store). Additionally,
SRR1[WAY] identifies the associativity class of the TLB entry selected for
replacement by the LRU algorithm. The software can change this value, effectively
overriding the replacement algorithm. Finally, the SRR1[KEY] bit is used by the
table search software to determine if there is a protection violation associated with
the access (useful on data write misses for determining if the C bit should be updated
in the table). Table 5-10 summarizes the SRR1 bits updated whenever one of the
three TLB miss exceptions occurs.

Table 5-10. SRR1 Bits Specific to PowerPC 603e Processor

Bit Number  Name  Function
----------  ----  ------------------------------------------------------------
0-3         CRF0  Condition register field 0 bits
12          KEY   Key for TLB miss (either Ks or Kp from segment register,
                  depending on whether the access is a user or supervisor
                  access)
13          D/I   Set if instruction TLB miss
14          WAY   Next TLB set to be replaced (set per LRU)
15          S/L   Set if data TLB miss was for a load instruction

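
The SRR1 fields of Table 5-10 can be unpacked as follows. This is only an illustrative sketch: the function name decode_tlb_miss_srr1 and the dictionary keys are hypothetical, and the shifts simply restate the bit numbers of the table (bit 0 is the most-significant bit).

```python
def decode_tlb_miss_srr1(srr1):
    """Split a 32-bit SRR1 value into the TLB miss fields of Table 5-10."""
    return {
        "crf0": (srr1 >> 28) & 0xF,  # bits 0-3: saved CR0 field
        "key": (srr1 >> 19) & 1,     # bit 12: KEY
        "d_i": (srr1 >> 18) & 1,     # bit 13: D/I
        "way": (srr1 >> 17) & 1,     # bit 14: WAY
        "s_l": (srr1 >> 16) & 1,     # bit 15: S/L
    }
```

A handler written in assembly tests the same bits with mask constants; for example, 0x0008 in the upper halfword corresponds to bit 12 (KEY).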

The key bit saved in SRR1 is derived as shown in Figure 5-11.

Select KEY from segment register:
    If MSR[PR] = 0, KEY = Ks
    If MSR[PR] = 1, KEY = Kp

Figure 5-11. Derivation of Key Bit for SRR1
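
The selection in Figure 5-11 can be expressed as a short function. This sketch assumes the 32-bit architecture's bit positions for Ks and Kp in the segment register (bits 1 and 2) and for MSR[PR] (bit 17); those positions come from the architecture definition rather than from this section, and the name select_key is hypothetical.

```python
SR_KS = 0x40000000   # segment register bit 1: supervisor-state key (assumed position)
SR_KP = 0x20000000   # segment register bit 2: user-state key (assumed position)
MSR_PR = 0x00004000  # MSR bit 17: privilege level, 1 = user (assumed position)


def select_key(segment_register, msr):
    """Return the KEY value saved in SRR1: Ks when MSR[PR] = 0, Kp when MSR[PR] = 1."""
    if msr & MSR_PR:
        return 1 if segment_register & SR_KP else 0
    return 1 if segment_register & SR_KS else 0
```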

The remainder of this section describes the format of the implementation-specific SPRs that 
are not defined by the PowerPC architecture, but are used by the TLB miss exception 
handlers. These registers can be accessed by supervisor-level instructions only. Any 
attempt to access these SPRs with user-level instructions results in a privileged instruction 
program exception. As DMISS, IMISS, DCMP, ICMP, HASH1, HASH2, and RPA are used
to access the translation tables for software table search operations, they should only be 
accessed when address translation is disabled (that is, MSR[IR] = 0 and MSR[DR] = 0). 
Note that MSR[IR] and MSR[DR] are cleared by the processor whenever an exception 
occurs. 

5.5.2.1.1 Data and Instruction TLB Miss Address Registers (DMISS and 
IMISS) 

The DMISS and IMISS registers have the same format as shown in Figure 5-12. They are 
loaded automatically upon a data or instruction TLB miss. The DMISS and IMISS contain 
the effective page address of the access that caused the TLB miss exception. The contents
are used by the processor when calculating the values of HASH1 and HASH2, and by the
tlbld and tlbli instructions when loading a new TLB entry. Note that the 603e always loads
a big-endian address into the DMISS register. These registers are read-only to the software. 



Effective Page Address 

0 31 

Figure 5-12. DMISS and IMISS Registers 

5.5.2.1.2 Data and Instruction TLB Compare Registers (DCMP and ICMP)

The DCMP and ICMP registers are shown in Figure 5-13. These registers contain the first 
word in the required PTE. The contents are constructed automatically from the contents of 
the segment registers and the effective address (DMISS or IMISS) when a TLB miss 
exception occurs. Each PTE read from the tables in memory during the table search process 
should be compared with this value to determine whether or not the PTE is a match. Upon
execution of a tlbld or tlbli instruction, the contents of the DCMP or ICMP register are
loaded into the first word of the selected TLB entry.



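+---+--------------------------------+----+----------+
| V |              VSID              | H  |   API    |
+---+--------------------------------+----+----------+
  0   1                           24   25   26     31

Figure 5-13. DCMP and ICMP Registers
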



Table 5-11 describes the bit settings for the DCMP and ICMP registers.

Table 5-11. DCMP and ICMP Bit Settings

Bits   Name  Description
-----  ----  ---------------------------------------------------------------
0      V     Valid bit. Set by the processor on a TLB miss exception.
1-24   VSID  Virtual segment ID. Copied from VSID field of corresponding
             segment register.
25     H     Hash function identifier. Cleared by the processor on a TLB
             miss exception.
26-31  API   Abbreviated page index. Copied from API of effective address.
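
As an illustration of Table 5-11, the compare word can be assembled from a segment register value and an effective address. This is a sketch under stated assumptions: the VSID is taken from the low-order 24 bits of the segment register and the API from bits 4-9 of the effective address, as in the 32-bit architecture definition; the function name compare_word is hypothetical.

```python
def compare_word(segment_register, ea, secondary=False):
    """Build the DCMP/ICMP-style compare value: V | VSID | H | API."""
    vsid = segment_register & 0x00FFFFFF  # VSID field of the segment register
    api = (ea >> 22) & 0x3F               # EA bits 4-9: abbreviated page index
    word = 0x80000000 | (vsid << 7) | api  # V = 1 (bit 0), VSID in bits 1-24
    if secondary:
        word |= 0x00000040                # H = 1 (bit 25) for the secondary PTEG
    return word
```

The processor supplies the value with H = 0; setting bit 25 (0x0040) gives the value to compare against PTEs in the secondary PTEG, which is what the example handlers do before the second pass.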



5.5.2.1.3 Primary and Secondary Hash Address Registers (HASH1 and 
HASH2) 

HASH1 and HASH2 contain the physical addresses of the primary and secondary PTEGs
for the access that caused the TLB miss exception. Only bits 7-25 differ between them. For
convenience, the processor automatically constructs the full physical address by routing
bits 0-6 of SDR1 into HASH1 and HASH2 and clearing the lower six bits. These registers
are read-only and are constructed from the contents of the DMISS or IMISS register. The
format for the HASH1 and HASH2 registers is shown in Figure 5-14.



+---------+-----------------------+----------+
| HTABORG |  Hashed Page Address  |  000000  |
+---------+-----------------------+----------+
  0     6   7                  25   26     31

Figure 5-14. HASH1 and HASH2 Registers

Table 5-12 describes the bit settings of the HASH1 and HASH2 registers.

Table 5-12. HASH1 and HASH2 Bit Settings

Bits   Name                 Description
-----  -------------------  -------------------------------------------------------
0-6    HTABORG[0-6]         Copy of the upper 7 bits of the HTABORG field from SDR1
7-25   Hashed page address  Address bits 7-25 of the PTEG to be searched.
26-31  —                    Reserved


5.5.2.1.4 Required Physical Address Register (RPA) 

The RPA is shown in Figure 5-15. During a page table search operation, the software must 
load the RPA with the second word of the correct PTE. When the tlbld or tlbli instruction 
is executed, data from the IMISS and ICMP (or DMISS and DCMP) and the RPA registers 
is merged and loaded into the selected TLB entry. The TLB entry is selected by the effective 
address of the access (loaded by the table search software from the DMISS or IMISS 
register) and the SRR1[WAY] bit. 

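
+--------------+----------+----+----+--------+----------+-------+
|     RPN      | Reserved | R  | C  |  WIMG  | Reserved |  PP   |
+--------------+----------+----+----+--------+----------+-------+
  0         19   20    22   23   24   25  28      29      30 31

Figure 5-15. Required Physical Address (RPA) Register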

Table 5-13 describes the bit settings of the RPA register.

Table 5-13. RPA Bit Settings

Bits   Name  Description
-----  ----  ----------------------------------
0-19   RPN   Physical page number from PTE
20-22  —     Reserved
23     R     Referenced bit from PTE
24     C     Changed bit from PTE
25-28  WIMG  Memory/cache access attribute bits
29     —     Reserved
30-31  PP    Page protection bits from PTE
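
The field positions of Table 5-13 translate directly into shifts and masks. The following sketch is illustrative only; decode_rpa and its dictionary keys are hypothetical names.

```python
def decode_rpa(word):
    """Split the second PTE word (the value written to RPA) into its fields."""
    return {
        "rpn": (word >> 12) & 0xFFFFF,  # bits 0-19: physical page number
        "r": (word >> 8) & 1,           # bit 23: referenced (mask 0x100)
        "c": (word >> 7) & 1,           # bit 24: changed (mask 0x080)
        "wimg": (word >> 3) & 0xF,      # bits 25-28: W, I, M, G
        "pp": word & 0x3,               # bits 30-31: page protection
    }
```

These are the same constants used by the example handlers later in this section: 0x100 for R, 0x80 for C, and 8 for the G bit.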



5.5.2.2 Software Table Search Operation

When a TLB miss occurs, the instruction or data MMU loads the IMISS or DMISS register,
respectively, with the effective address of the access. The processor completes all
instructions dispatched prior to the exception, status information is saved in SRR1, and one
of the three TLB miss exceptions is taken. In addition, the processor loads the ICMP or
DCMP register with the value to be compared with the first word of PTEs in the tables in
memory.

The software should then access the first PTE at the address pointed to by HASH1. The first
word of the PTE should be loaded and compared to the contents of DCMP or ICMP. If there
is a match, then the required PTE has been found and the second word of the PTE is loaded
from memory into the RPA register. Then the tlbli or tlbld instruction is executed, which
loads the contents of the ICMP (or DCMP) and RPA registers into the selected TLB entry.
The TLB entry is selected by the effective address of the access and the SRR1[WAY] bit.

If the compare did not result in a match, however, the PTEG address is incremented to point 
to the next PTE in the table and the above sequence is repeated. If none of the eight PTEs 
in the primary PTEG matches, the sequence is then repeated using the secondary PTEG (at 
the address contained in HASH2). 

If the PTE is also not found in the eight entries of the secondary page table, a page fault
condition exists, and a page fault exception must be synthesized. Thus the appropriate bits
must be set in SRR1 (or DSISR) and the TLB miss handler must branch to either the ISI or
DSI exception handler, which handles the page fault condition.
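
The search sequence described above can be modeled as a pair of nested loops. This is a conceptual sketch rather than handler code: read_word is a hypothetical callback that returns the 32-bit word at a physical address, and the other arguments stand for the contents of HASH1, HASH2, and DCMP (or ICMP).

```python
def search_page_table(read_word, hash1, hash2, compare):
    """Search the primary PTEG, then the secondary PTEG.

    Returns (pte_address, second_word) for the matching PTE, or None when
    neither PTEG holds a match (the page fault case).
    """
    # The secondary pass uses the compare value with the H bit (0x40) set.
    for pteg, expected in ((hash1, compare), (hash2, compare | 0x40)):
        for slot in range(8):             # eight 8-byte PTEs per PTEG
            address = pteg + 8 * slot
            if read_word(address) == expected:
                return address, read_word(address + 4)
    return None
```

A None result corresponds to the point at which the handler must synthesize the ISI or DSI exception; a match yields the word to be written to RPA before tlbli or tlbld is executed.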

This section provides a flow diagram outlining some example software that can be used to 
handle the three TLB miss exceptions, as well as some example assembly language that 
implements that flow. 

5.5.2.2.1 Flow for Example Exception Handlers 

Figure 5-16 shows the flow for the example TLB miss exception handlers. The flow shown 
is common for the three exception handlers, except that the IMISS and ICMP registers are 
used for the instruction TLB miss exception while the DMISS and DCMP registers are used 
for the two data TLB miss exceptions. Also, for the cases of store instructions that cause 
either a TLB miss or require a table search operation to update the C bit, the flow shows 
that the C bit is set in both the TLB entry and the PTE in memory. Finally, in the case of a 
page fault (no PTE found in the table search operation), the setup for the ISI or DSI 
exception is slightly different for these two cases. 

Figure 5-17 shows the flow for checking the R and C bits and setting them appropriately. 
Figure 5-18 shows the flow for synthesizing a page fault exception when no PTE is found. 
Figure 5-19 shows the flow for managing the cases of a TLB miss on an instruction access 
to guarded memory, and a TLB miss when C = 0 and a protection violation exists. The
setup for these protection violation exceptions is very similar to that of page fault conditions
(as shown in Figure 5-18) except that different bits in SRR1 (and DSISR) are set.
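
The protection check that guards the C-bit update can be sketched as follows. This is a simplified model of the decision made by the example store handler, assuming the architecture's PP encoding (PP = 10 is read/write, PP = 11 is read-only, and PP = 00 or 01 permit stores only when the key is 0); the names store_permitted and update_rc are hypothetical.

```python
PTE_R = 0x100  # referenced bit (bit 23 of the second PTE word)
PTE_C = 0x080  # changed bit (bit 24 of the second PTE word)


def store_permitted(pp, key):
    """Return True if a store is allowed for the given PP field and key."""
    if pp == 0b10:
        return True       # read/write regardless of key
    if pp == 0b11:
        return False      # read-only regardless of key
    return key == 0       # PP = 00 or 01: writable only with key 0


def update_rc(pte_word1, is_store, key):
    """Return the second PTE word with R (and C for a permitted store) set.

    Returns None when a store with C = 0 fails the protection check, which
    is the case in which a protection violation must be synthesized.
    """
    if is_store and not pte_word1 & PTE_C:
        if not store_permitted(pte_word1 & 0x3, key):
            return None
        return pte_word1 | PTE_R | PTE_C
    return pte_word1 | PTE_R
```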

Figure 5-16. Flow for Example Software Table Search Operation

Figure 5-17. Check and Set R and C Bit Flow
[Flowchart boxes: check protection (PP, SRR1[KEY]); set up for protection violation
(see Figure 5-19); set R bit (temp <- temp OR 0x100) and store byte 7 of PTE to memory;
set R, C bits (temp <- temp OR 0x180) and store bytes 6, 7 of PTE to memory; return to
TLB miss exception flow (see Figure 5-16).]

Figure 5-18. Page Fault Setup Flow

Figure 5-19. Setup for Protection Violation Exceptions




5.5.2.2.2 Code for Example Exception Handlers

This section provides some assembly language examples that implement the flow diagrams 
described above. Note that although these routines fit into a few cache lines, they are 
supplied only as a functional example; they could be further optimized for faster 
performance. 



# TLB software load for 603e
#
# New Instructions:
#       tlbld   - write the dtlb with the pte in rpa reg
#       tlbli   - write the itlb with the pte in rpa reg
# New SPRs
#       dmiss   - address of dstream miss
#       imiss   - address of istream miss
#       hash1   - returns primary hash PTEG address
#       hash2   - returns secondary hash PTEG address
#       iCmp    - returns the primary istream compare value
#       dCmp    - returns the primary dstream compare value
#       rpa     - the second word of pte used by tlblx
#
# gpr r0..r3 are shadowed
#
# there are three flows.
#       tlbDataMiss  - tlb miss on data load
#       tlbCeq0      - tlb miss on data store or store with tlb change bit == 0
#       tlbInstrMiss - tlb miss on instruction fetch
#+
# place labels for rel branches
#-


#.machine PPC_603e
        .set    r0, 0
        .set    r1, 1
        .set    r2, 2
        .set    r3, 3
        .set    dMiss, 976              # SPR number of DMISS
        .set    dCmp, 977               # SPR number of DCMP
        .set    hash1, 978              # SPR number of HASH1
        .set    hash2, 979              # SPR number of HASH2
        .set    iMiss, 980              # SPR number of IMISS
        .set    iCmp, 981               # SPR number of ICMP
        .set    rpa, 982                # SPR number of RPA
        .set    c0, 0
        .set    dar, 19
        .set    dsisr, 18
        .set    srr0, 26
        .set    srr1, 27
        .csect  tlbmiss[PR]
vec0:
        .globl  vec0
        .org    vec0+0x300
vec300:
        .org    vec0+0x400
vec400:
#+


# Instruction TLB miss flow
# Entry:
#       Vec = 1000
#       srr0      -> address of instruction that missed
#       srr1      -> 0:3=cr0 4=lru way bit 16:31 = saved MSR
#       msr<tgpr> -> 1
#       iMiss     -> ea that missed
#       iCmp      -> the compare value for the va that missed
#       hash1     -> pointer to first hash pteg
#       hash2     -> pointer to second hash pteg
#
# Register usage:
#       r0 is saved counter
#       r1 is junk
#       r2 is pointer to pteg
#       r3 is current compare value
#-
        .org    vec0+0x1000
tlbInstrMiss:
        mfspr   r2, hash1               # get first pointer
        addi    r1, 0, 8                # load 8 for counter
        mfctr   r0                      # save counter
        mfspr   r3, iCmp                # get first compare value
        addi    r2, r2, -8              # pre dec the pointer
im0:    mtctr   r1                      # load counter
im1:    lwzu    r1, 8(r2)               # get next pte
        cmp     c0, r1, r3              # see if found pte
        bdneq   im1                     # dec count br if cmp ne and if count not zero
        bne     instrSecHash            # if not found set up second hash or exit
        lwz     r1, +4(r2)              # load tlb entry lower-word
        andi.   r3, r1, 8               # check G-bit
        bne     doISIp                  # if guarded, take an ISI
        mtctr   r0                      # restore counter
        mfspr   r0, iMiss               # get the miss address for the tlbli
        mfspr   r3, srr1                # get the saved cr0 bits
        mtcrf   0x80, r3                # restore CR0
        mtspr   rpa, r1                 # set the pte
        ori     r1, r1, 0x100           # set reference bit
        srwi    r1, r1, 8               # get byte 7 of pte
        tlbli   r0                      # load the itlb
        stb     r1, +6(r2)              # update page table
        rfi                             # return to executing program



#+
# Register usage:
#       r0 is saved counter
#       r1 is junk
#       r2 is pointer to pteg
#       r3 is current compare value
#-
instrSecHash:
        andi.   r1, r3, 0x0040          # see if we have done second hash
        bne     doISI                   # if so, go to ISI exception
        mfspr   r2, hash2               # get the second pointer
        ori     r3, r3, 0x0040          # change the compare value
        addi    r1, 0, 8                # load 8 for counter
        addi    r2, r2, -8              # pre dec for update on load
        b       im0                     # try second hash

#+
# entry Not Found: synthesize an ISI exception
# guarded memory protection violation: synthesize an ISI exception
# Entry:
#       r0 is saved counter
#       r1 is junk
#       r2 is pointer to pteg
#       r3 is current compare value
#
doISIp:
        mfspr   r3, srr1                # get srr1
        andi.   r2, r3, 0xffff          # clean upper srr1
        addis   r2, r2, 0x0800          # or in srr<4> = 1 to flag prot violation
        b       isi1
doISI:
        mfspr   r3, srr1                # get srr1
        andi.   r2, r3, 0xffff          # clean srr1
        addis   r2, r2, 0x4000          # or in srr1<1> = 1 to flag pte not found
        mtctr   r0                      # restore counter
isi1:   mtspr   srr1, r2                # set srr1
        mfmsr   r0                      # get msr
        xoris   r0, r0, 0x2             # flip the msr<tgpr> bit
        mtcrf   0x80, r3                # restore CR0
        mtmsr   r0                      # flip back to the native gprs
        b       vec400                  # go to instr. access exception
#








#+
# Data TLB miss flow
# Entry:
#       Vec = 1100
#       srr0      -> address of instruction that caused data tlb miss
#       srr1      -> 0:3=cr0 4=lru way bit 5=1 if store 16:31 = saved MSR
#       msr<tgpr> -> 1
#       dMiss     -> ea that missed
#       dCmp      -> the compare value for the va that missed
#       hash1     -> pointer to first hash pteg
#       hash2     -> pointer to second hash pteg
#
# Register usage:
#       r0 is saved counter
#       r1 is junk
#       r2 is pointer to pteg
#       r3 is current compare value
#-
        .csect  tlbmiss[PR]
        .org    vec0+0x1100
tlbDataMiss:
        mfspr   r2, hash1               # get first pointer
        addi    r1, 0, 8                # load 8 for counter
        mfctr   r0                      # save counter
        mfspr   r3, dCmp                # get first compare value
        addi    r2, r2, -8              # pre dec the pointer
dm0:    mtctr   r1                      # load counter
dm1:    lwzu    r1, 8(r2)               # get next pte
        cmp     c0, r1, r3              # see if found pte
        bdneq   dm1                     # dec count br if cmp ne and if count not zero
        bne     dataSecHash             # if not found set up second hash or exit
        lwz     r1, +4(r2)              # load tlb entry lower-word
        mtctr   r0                      # restore counter
        mfspr   r0, dMiss               # get the miss address for the tlbld
        mfspr   r3, srr1                # get the saved cr0 bits
        mtcrf   0x80, r3                # restore CR0
        mtspr   rpa, r1                 # set the pte
        ori     r1, r1, 0x100           # set reference bit
        srwi    r1, r1, 8               # get byte 7 of pte
        tlbld   r0                      # load the dtlb
        stb     r1, +6(r2)              # update page table
        rfi                             # return to executing program




#+
# Register usage:
#       r0 is saved counter
#       r1 is junk
#       r2 is pointer to pteg
#       r3 is current compare value
#-
dataSecHash:
        andi.   r1, r3, 0x0040          # see if we have done second hash
        bne     doDSI                   # if so, go to DSI exception
        mfspr   r2, hash2               # get the second pointer
        ori     r3, r3, 0x0040          # change the compare value
        addi    r1, 0, 8                # load 8 for counter
        addi    r2, r2, -8              # pre dec for update on load
        b       dm0                     # try second hash




#+
# C=0 in dtlb and dtlb miss on store flow
# Entry:
#       Vec = 1200
#       srr0      -> address of store that caused the exception
#       srr1      -> 0:3=cr0 4=lru way bit 5=1 16:31 = saved MSR
#       msr<tgpr> -> 1
#       dMiss     -> ea that missed
#       dCmp      -> the compare value for the va that missed
#       hash1     -> pointer to first hash pteg
#       hash2     -> pointer to second hash pteg
#
# Register usage:
#       r0 is saved counter
#       r1 is junk
#       r2 is pointer to pteg
#       r3 is current compare value
#-
        .csect  tlbmiss[PR]
        .org    vec0+0x1200
tlbCeq0:
        mfspr   r2, hash1               # get first pointer
        addi    r1, 0, 8                # load 8 for counter
        mfctr   r0                      # save counter
        mfspr   r3, dCmp                # get first compare value
        addi    r2, r2, -8              # pre dec the pointer
ceq0:   mtctr   r1                      # load counter
ceq1:   lwzu    r1, 8(r2)               # get next pte
        cmp     c0, r1, r3              # see if found pte
        bdneq   ceq1                    # dec count br if cmp ne and if count not zero
        bne     cEq0SecHash             # if not found set up second hash or exit
        lwz     r1, +4(r2)              # load tlb entry lower-word
        andi.   r3, r1, 0x80            # check the C-bit
        beq     cEq0ChkProt             # if (C==0) go check protection modes
ceq2:   mtctr   r0                      # restore counter
        mfspr   r0, dMiss               # get the miss address for the tlbld
        mfspr   r3, srr1                # get the saved cr0 bits
        mtcrf   0x80, r3                # restore CR0
        mtspr   rpa, r1                 # set the pte
        tlbld   r0                      # load the dtlb
        rfi                             # return to executing program



#+
# Register usage:
#       r0 is saved counter
#       r1 is junk
#       r2 is pointer to pteg
#       r3 is current compare value
#-
cEq0SecHash:
        andi.   r1, r3, 0x0040          # see if we have done second hash
        bne     doDSI                   # if so, go to DSI exception
        mfspr   r2, hash2               # get the second pointer
        ori     r3, r3, 0x0040          # change the compare value
        addi    r1, 0, 8                # load 8 for counter
        addi    r2, r2, -8              # pre dec for update on load
        b       ceq0                    # try second hash

#+
# entry found and PTE(c-bit==0):
# (check protection before setting PTE(c-bit))
# Register usage:
#       r0 is saved counter
#       r1 is PTE entry
#       r2 is pointer to pteg
#       r3 is trashed
#-
cEq0ChkProt:
        rlwinm. r3, r1, 30, 0, 1        # test PP
        bge-    chk0                    # if (PP==00 or PP==01) goto chk0:
        andi.   r3, r1, 1               # test PP[0]
        beq+    chk2                    # return if PP[0]==0
        b       doDSIp                  # else DSIp
chk0:   mfspr   r3, srr1                # get old msr
        andis.  r3, r3, 0x0008          # test the KEY bit (SRR1-bit 12)
        beq     chk2                    # if (KEY==0) goto chk2:
        b       doDSIp                  # else DSIp
chk2:   ori     r1, r1, 0x180           # set reference and change bit
        sth     r1, +6(r2)              # update page table
        b       ceq2                    # and back we go

#+
# entry Not Found: synthesize a DSI exception
# Entry:
#       r0 is saved counter
#       r1 is junk
#       r2 is pointer to pteg
#       r3 is current compare value
#
doDSI:
        mfspr   r3, srr1                # get srr1
        rlwinm  r1, r3, 9, 6, 6         # get srr1<flag> to bit 6 for load/store, zero rest
        addis   r1, r1, 0x4000          # or in dsisr<1> = 1 to flag pte not found
        b       dsi1
doDSIp:
        mfspr   r3, srr1                # get srr1
        rlwinm  r1, r3, 9, 6, 6         # get srr1<flag> to bit 6 for load/store, zero rest
        addis   r1, r1, 0x0800          # or in dsisr<4> = 1 to flag prot violation
dsi1:   mtctr   r0                      # restore counter
        andi.   r2, r3, 0xffff          # clear upper bits of srr1
        mtspr   srr1, r2                # set srr1
        mtspr   dsisr, r1               # load the dsisr
        mfspr   r1, dMiss               # get miss address
        rlwinm. r2, r2, 0, 31, 31       # test LE bit
        beq     dsi2                    # if little endian then:
        xori    r1, r1, 0x07            # de-mung the data address
dsi2:   mtspr   dar, r1                 # put in dar
        mfmsr   r0                      # get msr
        xoris   r0, r0, 0x2             # flip the msr<tgpr> bit
        mtcrf   0x80, r3                # restore CR0
        mtmsr   r0                      # flip back to the native gprs
        b       vec300                  # branch to DSI exception
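
The DSISR value built at doDSI and doDSIp can be checked with a small model of the rlwinm and addis steps. This is only a sketch of the arithmetic in the listing; rotl32 and synthesize_dsisr are hypothetical helper names.

```python
def rotl32(value, count):
    """Rotate a 32-bit value left by count bits (0 < count < 32)."""
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def synthesize_dsisr(srr1, protection_violation):
    """Model 'rlwinm r1,r3,9,6,6' followed by the addis that sets the cause bit."""
    # Rotating left by 9 moves SRR1 bit 15 to bit 6; the mask keeps only bit 6.
    dsisr = rotl32(srr1, 9) & 0x02000000
    if protection_violation:
        dsisr += 0x08000000   # DSISR bit 4: protection violation
    else:
        dsisr += 0x40000000   # DSISR bit 1: PTE not found
    return dsisr
```

Because the rotate leaves only bit 6 set or clear, the subsequent addition behaves as an OR of the cause bit, which is why the listing can use addis.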

5.5.3 Page Table Updates

When TLBs are implemented (as in the 603e) they are defined as noncoherent caches of the 
page tables. TLB entries must be flushed explicitly with the TLB invalidate entry 
instruction (tlbie) whenever the corresponding PTE is modified. As the 603e is intended 
primarily for uniprocessor environments, it does not provide coherency of TLBs between 
multiple processors. If the 603e is used in a multiprocessor environment where TLB 
coherency is required, all synchronization must be implemented in software. 

Processors may write referenced and changed bits with unsynchronized, atomic byte store
operations. Note that the V, R, and C bits each reside in a distinct byte of a PTE. Therefore,
extreme care must be taken to use byte writes when updating only one of these bits.
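
The byte placement of these bits can be illustrated with an 8-byte PTE image. This is a sketch under the PTE layout used earlier in this chapter (V in bit 0 of the first word, R in bit 23 and C in bit 24 of the second word, big-endian byte order); the function names are hypothetical.

```python
def set_referenced(pte):
    """Set R by modifying only byte 6 of the 8-byte PTE image."""
    pte[6] |= 0x01   # R is the low-order bit of byte 6


def set_changed(pte):
    """Set C by modifying only byte 7 of the 8-byte PTE image."""
    pte[7] |= 0x80   # C is the high-order bit of byte 7
```

For example, starting from `pte = bytearray(8)` with `pte[0] = 0x80` (V set), calling `set_referenced(pte)` leaves bytes 0 and 7 untouched, so a concurrent update of V or C made with its own byte store is not overwritten.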

Explicitly altering certain MSR bits (using the mtmsr instruction), or explicitly altering
PTEs, or certain system registers, may have the side effect of changing the effective or
physical addresses from which the current instruction stream is being fetched. This kind of
side effect is defined as an implicit branch. Implicit branches are not supported and an
attempt to perform one causes boundedly-undefined results. Therefore, PTEs must not be 
changed in a manner that causes an implicit branch. Chapter 2, “PowerPC Register Set,” in 
The Programming Environments Manual, lists the possible implicit branch conditions that 
can occur when system registers and MSR bits are changed. 

5.5.4 Segment Register Updates 

There are certain synchronization requirements for using the move to segment register 
instructions. These are described in “Synchronization Requirements for Special Registers 
and for Lookaside Buffers” in Chapter 2, “PowerPC Register Set,” in The Programming 
Environments Manual. 

Chapter 6
Instruction Timing 

This chapter describes instruction prefetch and execution through all of the execution units 
of the PowerPC 603e microprocessor. It also provides examples of instruction sequences 
showing concurrent execution and various register dependencies to illustrate timing 
interactions. Bus signals described in this chapter are only accurate to within half clock
cycle increments. See Chapter 8, “System Interface Operation,” for more specific
information regarding bus operation timing. Instruction mnemonics used in this chapter can
be identified by referring to Chapter 8, “Instruction Set,” in The Programming
Environments Manual.

6.1 Terminology and Conventions 

This section describes terminology and conventions used in this chapter. 

• Branch prediction — The process of guessing whether a branch will be taken. Such 
predictions can be correct or incorrect; the term predicted as it is used here does not 
imply that the prediction is correct (successful). The PowerPC architecture defines 
a means for static branch prediction, which is part of the instruction encoding. 

• Branch resolution — The determination of whether a branch is taken or not taken. A 
branch is said to be resolved when it can be determined exactly which path it will 
take. If the branch is resolved as predicted, the instructions following the predicted 
branch can be completed. If the branch is not resolved as predicted, instructions on 
the mispredicted path are purged from the instruction pipeline and are replaced with 
the instructions from the nonpredicted path. 

• Completion — Completion occurs when an instruction is removed from the 
completion buffer. When an instruction completes we can be sure that this 
instruction and all previous instructions will cause no exceptions. In some situations, 
an instruction can finish and complete in the same cycle. 

• Finish — The term indicates the final cycle of execution. In this cycle, the completion 
buffer is updated to indicate that the instruction has finished executing. 

• Latency — The number of clock cycles necessary to execute an instruction and make 
ready the results of that execution for a subsequent instruction. 

• Pipeline — In the context of instruction timing, the term pipeline refers to the 
interconnection of the stages. The events necessary to process an instruction are 
broken into several cycle-length tasks to allow work to be performed on several 
instructions simultaneously — analogous to an assembly line. As an instruction is 
processed, it passes from one stage to the next. When it does, the stage becomes 
available for the next instruction. 

Although an individual instruction may take many cycles to complete (the number 
of cycles is called instruction latency), pipelining makes it possible to overlap the 
processing so that the throughput (number of instructions completed per cycle) is 
greater than if pipelining were not implemented. 

• Program order — The original order in which program instructions are provided to 
the instruction queue from the cache. 

• Rename buffer — Temporary buffers used by instructions that have not completed 
and as write-back buffers for those that have. 

• Reservation station — A buffer between the dispatch and execute stages that allows 
instructions to be dispatched even though the operands required for execution may 
not yet be available. 

• Stage — An element in the pipeline at which certain actions are performed, such as 
decoding the instruction, performing an arithmetic operation, and writing back the 
results. A stage typically takes a cycle to perform its operation; however, some 
stages are repeated (a double-precision floating-point multiply, for example). When 
this occurs, an instruction immediately following it in the pipeline is forced to stall 
in its cycle. 

In some cases, an instruction may also occupy more than one stage 
simultaneously — for example, instructions may complete and write back their 
results in the same cycle. 

After an instruction is fetched, it can always be defined as being in one or more 
stages. 

• Stall — An occurrence when an instruction cannot proceed to the next stage. 

• Superscalar — A superscalar processor is one that can issue multiple instructions 
concurrently from a conventional linear instruction stream. In a superscalar 
implementation, multiple instructions can be in the same stage at the same time. 

• Throughput — A measure of the number of instructions that are processed per cycle. 
For example, a series of double-precision floating-point multiply instructions has a 
throughput of one instruction per clock cycle. 

• Write-back — Write-back (in the context of instruction handling) occurs when a 
result is written from the rename registers into the architectural registers (typically 
the GPRs and FPRs). Results are written back at completion time or are moved into 
the write-back buffer. Results in the write-back buffer cannot be flushed. If an 
exception occurs, these buffers must write back before the exception is taken. 

6.2 Instruction Timing Overview 

The 603e has been designed to minimize average instruction execution latency. Latency is 
defined as the number of clock cycles necessary to execute an instruction and make ready 
the results of that execution for a subsequent instruction. For many of the instructions in the 
603e, this can be simplified to include only the execute phase for a particular instruction. 
However, data access instructions require additional clock cycles between the execute 
phase and the write-back phase due to memory latencies. 

In accordance with this definition, logical, bit-field, and most integer instructions have a 
latency of one clock cycle (that is, results for these instructions are ready for use on 
the next clock cycle after issue). Other instructions, such as the integer multiply, require 
more than one clock cycle to complete execution. 

Effective throughput of more than one instruction per clock cycle can be realized by the 
many performance features in the 603e including pipelining, superscalar instruction issue, 
branch acceleration, and multiple execution units that operate independently and in 
parallel. 

The load/store and floating-point units on the 603e are pipelined, which means that the 
execution units are broken into stages. Each stage performs a specific step, which 
contributes to the overall execution of an instruction. The pipelined design is analogous to 
an assembly line where workers perform a specific task and pass the partially complete 
product to the next worker. 

When an instruction is issued to a pipelined execution unit, the first stage in the pipeline 
begins its designated work on that instruction. As an instruction is passed from one stage in 
the pipeline to the next, evacuated stages may accept new instructions. This design allows 
a single execution unit to be working on several different instructions simultaneously. 
While it may take several cycles for a given instruction to propagate through the execution 
pipeline, once the pipeline has been filled with instructions the execution unit is capable of 
completing an instruction every clock. 

Figure 6-1 shows a graphical representation of a generic pipelined execution unit. 



Figure 6-1. Pipelined Execution Unit (clocks 0 through 3 shown) 

If the number of stages in each pipeline is equal to the total latency in clock cycles of its 
respective execution unit, the processor can continuously issue instructions to the same 
execution unit without stalling. Thus, when enough instructions have been issued to an 
execution unit to fill its pipeline, the first instruction will have completed execution and 
exited the pipeline, allowing subsequent instructions to be issued into the tail of the pipeline 
without interruption. 
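
The fill-and-stream behavior described above can be sketched with a small model. The following is a minimal illustration only, assuming an idealized execution unit in which every stage takes exactly one clock and no stalls occur; the function names and the three-stage depth used in the usage note are hypothetical and are not taken from the 603e specification.

```python
from typing import List, Optional


def cycles_to_finish(num_instructions: int, num_stages: int) -> int:
    """Clocks for an idealized, stall-free pipeline to finish all instructions."""
    if num_instructions <= 0:
        return 0
    # The first instruction needs num_stages clocks to traverse the pipeline;
    # each later instruction finishes one clock after its predecessor.
    return num_stages + (num_instructions - 1)


def stage_occupancy(num_instructions: int, num_stages: int) -> List[List[Optional[int]]]:
    """For each clock, list which instruction index occupies each stage."""
    table = []
    for clock in range(cycles_to_finish(num_instructions, num_stages)):
        # Instruction i is in stage s during clock i + s.
        row = [clock - stage if 0 <= clock - stage < num_instructions else None
               for stage in range(num_stages)]
        table.append(row)
    return table
```

For a hypothetical three-stage unit, cycles_to_finish(4, 3) returns 6: three clocks to fill the pipeline, then one instruction finishing on each subsequent clock.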

The 603e’s completion buffer is capable of retiring two instructions on every clock cycle. 
In general, instruction processing is accomplished in five stages: the fetch stage, the 
decode stage, the execute stage, the completion stage, and the write-back stage. The 
instruction fetch stage includes the clock cycles necessary to request instructions from the 
on-chip cache as well as the time it takes the on-chip cache to respond to that request. The 
decode stage consists of the time it takes to fully decode the instruction. In the completion 
stage, two instructions per cycle are completed in program order. In the write-back stage, 
results are returned to the register file. Instructions are fetched and executed concurrently 
with the execution and write back of previous instructions producing an overlap period 
between instructions. The details of these operations are explained in the following 
paragraphs. 

6.3 Timing Considerations 

A superscalar machine is one that can issue multiple instructions concurrently from a 
conventional linear instruction stream. The 603e is a true superscalar implementation of the 
PowerPC architecture since a maximum of three instructions can be issued to the execution 
units (one branch instruction to the branch processing unit, and two instructions issued 
from the dispatch queue to the other execution units) during each clock cycle. Although a 
superscalar implementation complicates instruction timing, these complications are 
transparent to the software. While the 603e appears to the programmer to execute 
instructions in sequential order, the 603e provides increased performance by executing 
multiple instructions at a time, and using hardware to manage dependencies. 

When an instruction is issued, the register file places the appropriate source data on the 
appropriate source bus. The corresponding execution unit then reads the data from the bus. 
The register files and source buses have sufficient bandwidth to allow the dispatching of 
two instructions per clock. 

The 603e contains the following execution units that operate independently and in parallel: 

• Branch processing unit (BPU) 

• 32-bit integer unit (IU) 

• 64-bit floating-point unit (FPU) 

• Load/store unit (LSU) 

• System register unit (SRU) 

The 603e’s branch processing unit decodes and executes branches immediately after they 
are fetched. The resources of the branch unit include a count register (CTR) rename 
register for mtspr(CTR), a link register (LR) rename register for mtspr(LR), a link register 
(LR) rename register for branches specifying an update of the link register, and a branch 
reservation station for conditional branches that cannot be resolved due to a CR-data 
dependency. 

When a conditional branch cannot be resolved due to a CR-data dependency, the branch 
direction is predicted and execution commences down the predicted path. If the branch 
resolves as incorrectly guessed, then: 1) the instruction buffer is purged and fetching of the 
correct path commences, 2) any instructions executed prior to the predicted branch in the 
completion buffer are allowed to “complete”, 3) all instructions executed subsequent to the 
mispredicted branch are purged from the machine, and 4) dispatching down the correct path 
commences. 

When the IU, SRU, or FPU finishes executing an instruction, it places the resulting data, if 
any, into one of the general-purpose register (GPR) or floating-point register (FPR) rename 
registers. The results are then stored into the correct GPR or FPR during the write-back stage. If a 
subsequent instruction is waiting for this data, it is forwarded past the register file, directly 
into the appropriate execution unit for the immediate execution of the waiting instruction. 
This allows a data-dependent instruction to be decoded without waiting for the data to be 
written into the register file and then read back out again. This feature, known as feed 
forwarding, significantly shortens the time the machine may stall on data dependencies. 

6.3.1 General Instruction Flow 

Instructions are fetched from the instruction cache at a peak rate of two per cycle, and 
placed in either the instruction queue (IQ) or the BPU. Instructions enter the IQ and are 
issued to the various execution units from the dispatch queue. The IQ is a six-entry queue, 
which is the backbone of the master pipeline for the microprocessor. The 603e tries to keep 
the IQ full at all times. Although two instructions can be brought in from the on-chip cache 
in a single clock cycle, if there is a one-instruction vacancy in the IQ, one instruction will 
be fetched from the cache to fill it. If while topping off the IQ, the request for new 
instructions misses in the on-chip cache, then arbitration for a memory access will begin. 
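
The top-off rule can be expressed as a short calculation. This sketch assumes the on-chip cache is idle and the access hits; the constant and function names are illustrative, while the six-entry depth and the two-per-cycle peak rate come from the text above.

```python
IQ_ENTRIES = 6           # instruction queue depth stated in the text
MAX_FETCH_PER_CYCLE = 2  # peak fetch rate stated in the text


def instructions_to_fetch(iq_occupancy: int) -> int:
    """Number of instructions requested this cycle to keep the IQ full."""
    if not 0 <= iq_occupancy <= IQ_ENTRIES:
        raise ValueError("occupancy must be between 0 and IQ_ENTRIES")
    vacancies = IQ_ENTRIES - iq_occupancy
    # Never fetch more than the queue can hold or the fetch bus can deliver.
    return min(MAX_FETCH_PER_CYCLE, vacancies)
```

With five entries occupied, the function returns 1, matching the one-instruction vacancy case described above.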

Instructions enter the IQ through entry 5 and filter down to be issued from queue entry 1 
or 0. The fetch bus between the IQ and the on-chip cache is wide enough for two 
instructions to be brought into the IQ simultaneously, which matches the dispatcher’s 
ability to issue two instructions per cycle. 

Branch instructions are identified by the fetcher, and forwarded to the BPU directly, 
bypassing the dispatch queue. The branch is either executed and resolved (if the branch is 
unconditional or if required conditions are available), or is predicted. Once a branch 
instruction has been executed, it may need to update a special-purpose register. In that case, 
the branch instruction will do its write back sometime after the decode/execute phase. If no 
write back is needed, the branch instruction is retired. All other instructions are issued from 
the dispatch queue, with dispatch rate contingent on execution unit busy status, rename and 
completion buffer availability, and the serializing behavior of some instructions. 
Instruction dispatch is done in program order, and if the instruction in queue entry 0 is 
unable to be dispatched, it will inhibit the instruction in queue entry 1 from being issued. 
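
The in-order dispatch rule can be sketched as follows. The function name and its boolean arguments are hypothetical; each argument stands for whether every resource needed by the instruction in that queue entry (execution unit, rename register, completion buffer) is available in the current cycle.

```python
def dispatch_count(can_dispatch_iq0: bool, can_dispatch_iq1: bool) -> int:
    """Instructions dispatched this cycle from IQ entries 0 and 1."""
    if not can_dispatch_iq0:
        # A blocked entry 0 inhibits entry 1, preserving program order.
        return 0
    return 2 if can_dispatch_iq1 else 1
```

Note that dispatch_count(False, True) is 0 even though the instruction in entry 1 could otherwise proceed.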

Figure 6-2 reflects the organization of the 603e, and the paths taken by instructions issued 
from the instruction queue and how those instructions progress through the various 
execution units. 



Figure 6-2. Instruction Flow Diagram 

6.3.2 Instruction Fetch Timing 

The timing of the instruction fetch mechanism on the 603e depends heavily on the state of 
the on-chip cache. The speed with which the required instruction is returned to the fetcher 
depends on whether the instruction being asked for is in the on-chip cache (cache hit) or 
whether a memory transaction is required to bring the data into the cache (cache miss). 

These issues are discussed further in the following sections. 

6.3.2.1 Cache Arbitration 

When the instruction fetcher attempts to fetch instructions from the on-chip cache, the 
cache may or may not be able to immediately respond to the request. There are two 
scenarios that may be encountered by the instruction fetcher when it requests instructions 
from the on-chip cache. 

The first scenario is when the on-chip cache is idle and a request comes in from the 
instruction fetcher for additional instructions. In this case, the on-chip cache responds with 
the requested instructions on the next clock cycle. 

The second scenario occurs if at the time the instruction fetcher requests instructions, the 
on-chip cache is busy due to a cache-line-reload operation. When this case arises, the 
on-chip cache will be inaccessible until the reload operation is complete. 

6.3.2.2 Cache Hit 

Assuming that the instruction fetcher is not blocked from the cache by a cache-reload 
operation and the instructions it needs are in the on-chip cache (a cache hit has occurred), 
there will be only one clock cycle between the time that the instruction fetcher requests the 
instructions and the time that the instructions enter the IQ. As previously stated, two 
instructions can be simultaneously fetched from the on-chip cache and loaded into the IQ. 

Figure 6-3 shows a brief example of instruction fetching that hits in the on-chip cache. In 
this example, two instructions are fetched into the IQ during clock cycle 0. During clock 
cycle 1, instructions 0 and 1 are dispatched to the integer and floating-point execution units. 
During clock cycle 2, a branch instruction is fetched into the branch processing unit. The 
BPU is immediately able to determine that the branch will indeed change program flow and 
sends a request to the on-chip cache for the new instruction stream. 

During clock cycle 4, the new instructions arrive in the IQ. In clock cycle 5, one integer 
instruction is dispatched to the integer unit, and the following instruction (also an integer 
instruction) is blocked from dispatch until clock cycle 6. Instructions fetched in clock 
cycle 5 are held in the IQ until the dispatch queue is cleared on the next cycle. As the IQ is 
emptied into the individual execution units, additional instructions will be requested from 
the on-chip cache. 

Figure 6-3. Instruction Timing — Cache Hit 



6.3.2.3 Cache Miss 

Figure 6-4 shows a brief example of an instruction fetch that misses in the on-chip cache 
and how that fetch affects the instruction issue. Note that the processor/bus clock ratio is 
1:1 in this example. 

In this example, two instructions are fetched into the IQ during clock cycle 0. During clock 
cycle 1, instructions 0 and 1 are dispatched to the integer and floating-point execution units. 
During clock cycle 2, a branch instruction is fetched into the branch processing unit. The 
BPU is immediately able to determine that the branch will indeed change program flow and 
sends a request to the on-chip cache for the new instruction stream. 

Figure 6-4. Instruction Timing — Cache Miss 

During clock cycle 3, the on-chip cache misses the access and determines that a memory 
access will have to occur. During clock cycle 5, the address of the block of instructions is 
applied to the system bus. During clock cycle 7, two instructions (64 bits) are returned from 
memory, and are forwarded to the cache and the instruction fetcher. In subsequent clock 
cycles, one integer and one floating-point instruction are dispatched to their respective 
execution units. Instructions are forwarded to the instruction fetcher and the cache until the 
cache line reload is completed in cycle 10. 

6.3.3 Instruction Dispatch and Completion Considerations 

Several factors affect the 603e’s ability to dispatch instructions at a peak rate of two per 
cycle. These factors include execution unit availability, destination rename register 
availability, completion buffer availability, and the handling of dispatch-serialized 
instructions. 

To avoid dispatch unit stalls due to instruction data dependencies, the 603e provides a 
reservation station for each execution unit. If a data dependency exists that may preclude 
an instruction from beginning execution, that instruction will be dispatched to the 
reservation station associated with its execution unit, thereby clearing the dispatch unit. 
When the data that the operation depends upon is returned via a cache access or as a result 
of a previous operation, execution will begin during the same clock cycle that the register 
file is being updated. If the second instruction in the dispatch unit requires the same 
execution unit, dispatch of that instruction will stall until the first instruction completes 
execution. 
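
A minimal sketch of this dispatch rule follows, assuming a single-entry reservation station per execution unit as the text describes. The class and method names are hypothetical, and the model ignores timing inside the unit.

```python
class ExecutionUnit:
    """Toy model of one execution unit with a single reservation station."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.station = None  # (mnemonic, operands_ready), or None when free

    def try_dispatch(self, mnemonic: str, operands_ready: bool) -> bool:
        """Accept the instruction if the station is free, even when its
        operands are not yet available; otherwise dispatch stalls."""
        if self.station is not None:
            return False
        self.station = (mnemonic, operands_ready)
        return True

    def operands_arrived(self) -> None:
        """Mark the waiting instruction's operands as available."""
        if self.station is not None:
            self.station = (self.station[0], True)

    def finish_execution(self) -> bool:
        """Free the station once the held instruction has executed.

        Returns False if nothing is held or operands are still missing."""
        if self.station is None or not self.station[1]:
            return False
        self.station = None
        return True
```

In this model a data-dependent instruction leaves the dispatch unit immediately, but a second instruction that needs the same unit is refused until finish_execution() succeeds.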

The completion unit provides a mechanism to track instructions from dispatch through 
execution, and then retire or “complete” them in program order. Completing an instruction 
implies the commitment of the results of instruction execution to the architected registers. 
In-order completion ensures the correct architectural state when the 603e must recover 
from a mispredicted branch, or any other exception or interrupt. (Note that the term 
exception is referred to as interrupt in the architecture specification.) 

Instruction state and all information required for completion is kept in a first-in, first-out 
queue of five completion buffers. A single completion buffer is allocated for each 
instruction once it is dispatched by the dispatch unit. A completion buffer is a required 
resource for dispatch; if there are no completion buffers available, the dispatch unit will 
stall. While a maximum of two instructions per cycle may be completed and retired in 
program order from the completion unit, instruction completion can be stalled by the 
instruction reaching the last position in the completion queue while the instruction is still 
being executed. Store instructions, and instructions executed by the FPU and SRU (with 
the exception of integer add and compare instructions) can only be retired from the last 
position in the completion queue. 
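
The retirement rules above can be sketched as a small queue model. This is one reading of the text, not the hardware algorithm: the names Entry, retire_one_cycle, and last_only are hypothetical, and treating a last-position-only instruction as ineligible to be the second instruction retired in a cycle is an interpretation of "can only be retired from the last position."

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List

COMPLETION_BUFFERS = 5    # queue depth stated in the text
MAX_RETIRE_PER_CYCLE = 2  # peak completion rate stated in the text


@dataclass
class Entry:
    name: str
    finished: bool = False   # execution has finished
    last_only: bool = False  # may retire only from the last (oldest) position


def retire_one_cycle(queue: Deque[Entry]) -> List[str]:
    """Retire up to two finished instructions, oldest first, in program order."""
    retired: List[str] = []
    while queue and len(retired) < MAX_RETIRE_PER_CYCLE:
        head = queue[0]
        if not head.finished:
            break  # an unfinished oldest entry stalls completion
        if retired and head.last_only:
            break  # it was not in the last position when the cycle began
        retired.append(queue.popleft().name)
    return retired
```

Given a queue holding a finished add, a finished store, and a finished or, the model retires only the add in the first cycle and the store and or together in the second.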

The rate of instruction completion is also affected by the 603e’s ability to write the 
instruction results from the rename registers to the architected registers when the 
instruction is retired. The 603e can perform two write-back operations from the rename 
registers to the GPRs each clock cycle, but can perform only one write back per cycle to 
the CR, FPR, LR, and CTR. 
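
The write-back limits can be checked with a short helper. This sketch reads the text as allowing one write back per cycle to each of the CR, FPR, LR, and CTR; that reading, along with the names WRITEBACK_PORTS and can_write_back_together, is an assumption made for illustration.

```python
from collections import Counter
from typing import List

# Write-back operations allowed per clock for each register class.
WRITEBACK_PORTS = {"GPR": 2, "FPR": 1, "CR": 1, "LR": 1, "CTR": 1}


def can_write_back_together(destinations: List[str]) -> bool:
    """True if all listed destination classes can be written in one clock."""
    demand = Counter(destinations)
    return all(count <= WRITEBACK_PORTS[reg] for reg, count in demand.items())
```

Two GPR results can be written in the same clock, whereas two FPR results cannot.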

Due to the 603e’s out-of-order execution capability, the in-order completion of instructions 
by the completion unit provides a precise exception mechanism. All program-related 
exceptions are signaled when the instruction causing the exception has reached the last 
position in the completion buffer. All prior instructions are allowed to complete before the 
exception is taken. 

6.3.3.1 Rename Register Operation 

To avoid contention for a given register file location in the course of out-of-order execution, 
the 603e provides rename registers for the storage of instruction results prior to their 
commitment to the architected register by the completion unit. Five rename registers are 
provided for the GPRs, four for the FPRs, and one each for the condition register, the link 
register and the count register. 

When the dispatch unit dispatches an instruction to its execution unit, it allocates a rename 
register for the results of that instruction. If an instruction is dispatched to a reservation 
station associated with an execution unit due to a data dependency, the dispatcher will also 
provide a tag to the execution unit identifying which rename register will forward the 
required data upon instruction completion. When the data is available in the rename 
register, the pending execution may begin. 

Instruction results are transferred from the rename registers to the architected registers by 
the completion unit when an instruction is retired from the completion queue without 
exceptions and after any predicted branch conditions preceding it in the completion queue 
have been resolved correctly. If a predicted branch is found to have been incorrectly 
predicted, the instructions following the branch will be flushed from the completion queue, 
and the results of those instructions will be flushed from the rename registers. 
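
The allocate, retire, and flush life cycle of a rename register can be sketched as follows. The register counts come from the text above; the class RenamePool, its methods, and the use of a program-order sequence number to identify instructions are hypothetical conveniences, not a description of the hardware.

```python
from typing import List, Tuple

# Rename registers available per register class, as stated in the text.
RENAME_REGISTERS = {"GPR": 5, "FPR": 4, "CR": 1, "LR": 1, "CTR": 1}


class RenamePool:
    """Tracks which rename registers are held by in-flight instructions."""

    def __init__(self) -> None:
        self.in_use: List[Tuple[int, str]] = []  # (sequence number, class)

    def allocate(self, seq: int, reg_class: str) -> bool:
        """Reserve a rename register at dispatch; False means dispatch stalls."""
        used = sum(1 for _, rc in self.in_use if rc == reg_class)
        if used >= RENAME_REGISTERS[reg_class]:
            return False
        self.in_use.append((seq, reg_class))
        return True

    def retire(self, seq: int) -> None:
        """Release the register once its result is committed at completion."""
        self.in_use = [(s, rc) for s, rc in self.in_use if s != seq]

    def flush_after(self, branch_seq: int) -> None:
        """Discard results of instructions younger than a mispredicted branch."""
        self.in_use = [(s, rc) for s, rc in self.in_use if s <= branch_seq]
```

A sixth GPR-writing instruction cannot be dispatched until an older one retires or is flushed.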

6.3.3.2 Instruction Serialization 

While the 603e is capable of dispatching and completing two instructions per cycle, there 
is a class of instructions referred to as serializing instructions that limit dispatch and 
completion to one instruction per cycle. The types of serialization caused by these 
instructions fall into three categories — completion, dispatch, and refetch serialization. 

Completion serialized instructions are held in the execution unit until all prior instructions 
in the completion unit have been retired. Completion serialization is used for instructions 
that access or modify nonrenamed resources. Results from these instructions will not be 
available or forwarded for subsequent instructions until the serializing instruction is retired 
from the completion unit. Instructions that are completion serialized are as follows: 

• Instructions (with the exception of integer add and compare instructions) executed 
by the system register unit 

• Floating-point instructions that access or modify the FPSCR or CR (mtfsb1, mcrfs, 
mtfsfi, mffs, and mtfsf) 

• Instructions that manage caches and TLBs 

• Instructions that directly access the GPRs (load and store multiple word and load 
and store string instructions) 

• Instructions defined by the architecture to have synchronizing behavior 

A subset of the completion serialized instructions are dispatch serialized. Dispatch 
serialized instructions inhibit the dispatching of subsequent instructions until the serializing 
instruction is retired from the completion unit. Dispatch serialization is used for 
instructions that access renamed resources used by the dispatcher, and for instructions 
requiring refetch serialization, including: 

• The load multiple instructions, lmw, lswi, and lswx 

• The mtspr(XER) and mcrxr instructions 

• The synchronizing instructions, sync, isync, mtmsr, rfi, and sc 

A subset of the dispatch serialized instructions are also refetch serialized. Refetch serialized 
instructions inhibit dispatching of subsequent instructions and force the refetching of 
subsequent instructions after the serializing instructions are retired from the completion 
unit. The context synchronizing instruction, isync, is a refetch serializing instruction. 
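
Because each serialization class is a subset of the previous one, the three classes can be modeled as nested sets. The sketch below is illustrative and deliberately partial: it lists only mnemonics named in this section, it omits mtspr(XER) because that case depends on the operand rather than the mnemonic, and the names Serialization and classify are hypothetical.

```python
from enum import IntEnum


class Serialization(IntEnum):
    NONE = 0
    COMPLETION = 1
    DISPATCH = 2
    REFETCH = 3


# Each set contains the stricter set below it, mirroring the text.
REFETCH_SERIALIZED = {"isync"}
DISPATCH_SERIALIZED = REFETCH_SERIALIZED | {
    "lmw", "lswi", "lswx", "mcrxr", "sync", "mtmsr", "rfi", "sc"}
COMPLETION_SERIALIZED = DISPATCH_SERIALIZED | {
    "mtfsb1", "mcrfs", "mtfsfi", "mffs", "mtfsf"}


def classify(mnemonic: str) -> Serialization:
    """Return the strictest serialization class that applies."""
    if mnemonic in REFETCH_SERIALIZED:
        return Serialization.REFETCH
    if mnemonic in DISPATCH_SERIALIZED:
        return Serialization.DISPATCH
    if mnemonic in COMPLETION_SERIALIZED:
        return Serialization.COMPLETION
    return Serialization.NONE
```

An instruction classified as REFETCH therefore also carries the dispatch and completion serialization behavior.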

6.3.3.3 Execution Unit Considerations 

As previously noted, the 603e is capable of dispatching and retiring two instructions per 
clock cycle. One of the factors affecting the peak dispatch rate is the availability of 
execution units on each clock cycle. 

For an instruction to be issued, the required execution unit must be available. The 
dispatcher monitors the availability of all execution units and suspends instruction dispatch 
if the required execution unit is not available. An execution unit may not be available if it 
can accept and execute only one instruction per cycle, or if an execution unit’s pipeline 
becomes full. This situation may occur if instruction execution takes more clock cycles than 
the number of pipeline stages in the unit, and additional instructions are issued to that unit 
to fill the remaining pipeline stages. 

6.4 Execution Unit Timings 

The following sections describe instruction timing considerations within each of the 
respective execution units in the 603e. Refer to Table 6-1 for branch instruction execution 
timing. 

6.4.1 Branch Processing Unit Execution Timing 

Flow control operations (conditional branches, unconditional branches, and traps) are 
typically expensive to execute in most machines because they disrupt normal flow in the 
instruction stream. When a change in program flow occurs, the IQ must be reloaded with 
the target instruction stream. During this time the execution units will be idle. However, 
previously issued instructions will continue to execute while the new instruction stream 
makes its way into the IQ. 

Performance features such as branch folding and static branch prediction help minimize the 
penalties associated with flow control operations on the 603e. The timing for branch 
instruction execution is determined by many factors including the following: 

• Whether the branch is taken 

• Whether the target instruction stream is in the on-chip cache 

• Whether the branch is predicted 

• Whether the prediction is correct 

6.4.1.1 Branch Folding 

When a branch instruction is encountered by the fetcher, the BPU immediately tries to pull 
that instruction out of the instruction stream and resolve it. When the BPU pulls the branch 
instruction out of the instruction stream, the instruction above the branch is shifted down 
to take the place of the removed branch. The technique of removing the branch instruction 
from the instruction sequence seen by the other execution units is known as branch folding. 

Often, branch folding reduces the penalties of flow control instructions to zero since 
instruction execution proceeds as though the branch was never there. 

If the folded branch instruction changes program flow (the branch is said to be “taken” in 
this case), the BPU immediately requests the instructions at the new target from the on-chip 
cache. In most cases, the new instructions arrive in the IQ before any bubbles are 
introduced into the execution units. If the folded branch does not change program flow (the 
branch is said to be “not taken” in this case), the branch is already removed from the 
instruction stream and execution continues as if there were never a branch in the original 
sequence. 
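
Branch folding can be pictured as splitting the fetched stream in two. The following sketch is a simplification under stated assumptions: the set BRANCH_MNEMONICS lists only base branch mnemonics, the function name fold_branches is hypothetical, and no attempt is made to model branch resolution or target fetching.

```python
from typing import List, Tuple

# Base branch mnemonics recognized by this toy model.
BRANCH_MNEMONICS = {"b", "bl", "bc", "bclr", "bcctr"}


def fold_branches(fetched: List[str]) -> Tuple[List[str], List[str]]:
    """Split fetched instructions into (IQ stream, BPU stream).

    Branches are pulled out for the BPU; the remaining instructions close
    ranks, so the other execution units never see the branch."""
    to_iq = [m for m in fetched if m not in BRANCH_MNEMONICS]
    to_bpu = [m for m in fetched if m in BRANCH_MNEMONICS]
    return to_iq, to_bpu
```

For the fetched sequence add, bc, lwz, the dispatcher sees only add followed by lwz.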

When a conditional branch cannot be resolved due to a CR data dependency, the branch is 
executed by means of static branch prediction, and instruction fetching proceeds down the 
predicted path. If the branch prediction was incorrect when the branch is resolved, the 
instruction queue and all subsequently executed instructions are purged, instructions 
executed prior to the predicted branch are allowed to complete, and instruction fetching 
resumes down the correct path. 

There are several situations where instruction sequences create dependencies that prevent 
a branch instruction from being resolved immediately, thereby causing execution of the 
subsequent instruction stream based on the predicted outcome of the branch instruction. 
The instruction sequences and the resulting actions of the branch instruction are described as 
follows: 

• An mtspr(LR) followed by a bclr — Fetching is stopped, and the branch waits for 
the mtspr to execute. 

• An mtspr(CTR) followed by a bcctr — Fetching is stopped, and the branch waits for 
the mtspr to execute. 

• An mtspr(CTR) followed by a bc(CTR) — Fetching is stopped, and the branch waits 
for the mtspr to execute. 

• A bc(CTR) followed by another bc(CTR) — Fetching is stopped, and the second 
branch waits for the first branch to be completed. 

• A bc(CTR) followed by a bcctr — Fetching is stopped, and the bcctr waits for the 
first branch to be completed. 

• A branch(LK = 1) followed by a branch(LK = 1) — Fetching is stopped, and the 
second branch waits for the first branch to be completed. (Note: a bl instruction does 
not have to wait for a branch(LK = 1) to complete.) 

• A bc(based-on-CR) waiting for resolution due to a CR-dependency followed by a 
bc(based-on-CR) — Fetching is stopped and the second branch waits for the first 
CR-dependency to be resolved. (Note: branch conditions can be a function of the CTR 
and the CR; if the CTR condition is sufficient to resolve the branch, then a 
CR-dependency is ignored.) 

6.4.1.2 Static Branch Prediction 

Static branch prediction is a mechanism by which software (for example, compilers) can 
give a hint to the machine hardware about the direction the branch is likely to take. When 
a branch instruction encounters a data dependency, the BPU waits for the required 
condition code to become available. Rather than stalling instruction issue until the source 
operand is ready, the 603e predicts which path the branch instruction is likely to take, and 
instructions are fetched and executed along that path. When the branch operand becomes 
available, the branch is evaluated. If the predicted path was correct, program flow continues 
along that path uninterrupted; otherwise, the processor backs up, and program flow resumes 
along the correct path. 
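
The encoding hint itself is defined by the PowerPC architecture rather than in this section. As a hedged sketch of that convention (see The Programming Environments Manual for the authoritative definition): with the y bit of the BO field cleared, a bc instruction with a negative displacement is predicted taken and all other conditional branches are predicted not taken; setting y reverses the default. The function and parameter names below are hypothetical.

```python
def predicted_taken(displacement: int, y_bit: int, to_register: bool = False) -> bool:
    """Static prediction for a conditional branch.

    displacement -- signed branch displacement (ignored when to_register is set)
    y_bit        -- prediction hint bit from the BO field (0 or 1)
    to_register  -- True for bclr/bcctr, whose target comes from LR or CTR
    """
    # Default (y = 0): only a bc with a negative displacement is predicted taken.
    default_taken = (not to_register) and displacement < 0
    # y = 1 reverses the default prediction.
    return default_taken != bool(y_bit)
```

Under this convention a loop-closing backward bc is predicted taken by default, and a compiler sets y only when it expects the opposite outcome.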

There is a scenario where a flow control instruction will not be predicted on the 603e. If the 
target address of the branch (link or count register) will be modified by an instruction that 
appears before the branch instruction, the BPU must wait until the target address is 
available. 

The 603e executes through one level of prediction. The microprocessor may not predict a 
branch if a prior branch instruction is still unresolved. 

The number of instructions that can be executed after the issue of a predicted branch 
instruction is limited by the fact that no instruction executed after a predicted branch may 
actually update the register files or memory until the branch is completed. That is, 
instructions may be issued and executed, but may not reach the write-back stage in the 
completion unit. When an instruction following a predicted branch has completed execution,
it will not be moved into the write-back stage; instead, it will simply stall in the last stage
of the completion unit. This means that the completion queue may become full, which will 
limit the number of additional instructions that may be issued subsequent to an unresolved 
predicted branch. 

In the case of a misprediction, the 603e is able to redirect its machine state rather painlessly
because the programming model has not been updated. When a branch is found to be
mispredicted, all instructions that were issued subsequent to the predicted branch 
instruction are simply flushed from the completion queue, and their results flushed from the 
rename registers. No architected register state needs to be restored because no architected 
register state was modified by the instructions following the unresolved predicted branch. 
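
The behavior described above can be illustrated with a small model. The following is only a sketch under stated assumptions, not a description of the 603e's actual logic: the class name CompletionQueue, its methods, and the five-entry capacity (borrowed from the scheduling guideline of five instructions in the execute-complete stage) are hypothetical names chosen for illustration.

```python
from collections import deque


class CompletionQueue:
    """Toy model: results that follow an unresolved predicted branch are
    held back and never reach the architected registers until the branch
    resolves. All names here are illustrative assumptions."""

    def __init__(self, capacity=5):
        self.capacity = capacity
        self.entries = deque()   # (dest, value, speculative) in program order
        self.registers = {}      # architected register state

    def issue(self, dest, value, speculative):
        # A full queue limits further issue, as the text describes.
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append((dest, value, speculative))
        return True

    def write_back(self):
        # Retire in order, stopping at the first speculative entry.
        while self.entries and not self.entries[0][2]:
            dest, value, _ = self.entries.popleft()
            self.registers[dest] = value

    def resolve_branch(self, prediction_correct):
        if prediction_correct:
            # Entries are no longer speculative and may write back.
            self.entries = deque((d, v, False) for d, v, _ in self.entries)
        else:
            # Misprediction: flush everything issued after the branch.
            self.entries = deque(e for e in self.entries if not e[2])


cq = CompletionQueue()
cq.issue("r3", 1, speculative=False)   # issued before the branch
cq.issue("r4", 2, speculative=True)    # issued after the predicted branch
cq.issue("r5", 3, speculative=True)
cq.write_back()
cq.resolve_branch(prediction_correct=False)
```

In this sketch only r3 reaches the register dictionary; the two speculative results are discarded, so nothing has to be restored after the misprediction.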

6.4.1.2.1 Predicted Branch Timing Examples

Figure 6-5 depicts the cases where branch instructions are predicted, and shows both 
“taken” and “not taken” branch outcomes. During clock cycle 0, two instructions are 
dispatched to their respective execution units. Notice that the BPU has a combined 
decode/execute stage, thus the branch (instruction 1) is predicted not to be taken during 
clock cycle 1 because its source register (condition register) is not available. 

During clock cycle 2, instructions 0 and 2 progress through their pipelines. In addition, the 
branch (instruction 1) remains predicted. Notice that the next branch instruction 
(instruction 5) is not able to begin its decode/execute phase while instruction 1 is predicted. 

During clock cycle 3, instruction 0 begins its write-back stage. The write back of instruction 
0 resolves the data dependency for the first branch (instruction 1); thus the first branch 
becomes resolved and it is determined that the prediction was correct. Recall that only one 
branch may be predicted at a time; thus, when instruction 1 is resolved the BPU is free to 
predict instruction 5. 

During clock cycle 4, the second branch instruction remains predicted while additional
instructions move through the various pipelines. 

During clock cycle 5, the BPU realizes that the prediction made for instruction 5 was 
incorrect. Note that since instruction 6 was issued and executed conditionally, it never 
performed its write back. As a result of the misprediction, all instructions that followed the 
branch in the instruction stream must be flushed from the respective execution unit 
pipelines. Notice that instructions 6 and 7 do not continue execution since it has been 
determined that these instructions should have never been issued in the first place. Since 
the branch has been resolved, a request is sent to the on-chip cache for the new instruction 
stream (based on the execution of instruction 5). During clock cycle 6, the new set of instructions
is in the IQ, and the appropriate dispatching begins on clock cycle 7.

Figure 6-5. Branch Instruction Timing

6.4.2 Integer Unit Execution Timing 

The integer unit executes all integer and bit-field instructions. Many of these instructions 
execute in a single clock cycle. The integer unit has one execute phase in its pipeline, thus 
when a multicycle integer instruction is being executed, no other integer instructions may 
begin an execute phase. Refer to Table 6-4 for integer instruction execution timing. 

6.4.3 Floating-Point Unit Execution Timing 

The floating-point unit on the 603e executes all floating-point instructions. Execution of 
most floating-point instructions is pipelined within the FPU, allowing up to three 
instructions to be executing in the FPU concurrently. While most floating-point instructions
execute with three- or four-cycle latency, and one- or two-cycle throughput, three
instructions (fdivs, fdiv, and fres) execute with latencies of 18 to 33 cycles. The fdivs,
fdiv, fres, mtfsb0, mtfsb1, mtfsfi, mffs, and mtfsf instructions block the floating-point unit
pipeline until they complete execution, and thereby inhibit the dispatch of additional 
floating-point instructions. With the exception of the mcrfs instruction, all floating-point 
instructions will immediately forward their CR results to the BPU for fast branch resolution 
without waiting for the instruction to be retired by the completion unit, and the CR updated. 
Refer to Table 6-5 for floating-point instruction execution timing. 

6.4.4 Load/Store Unit Execution Timing

The execution of most load and store instructions is pipelined. The LSU has two pipeline 
stages; the first stage is for effective address calculation and MMU translation, and the 
second stage is for accessing the data in the cache. Load and store instructions have a two- 
cycle latency and one-cycle throughput. Load instructions that miss in the cache block 
subsequent accesses to the cache while the cache line refill is in process. Refer to Table 6-6 
for load and store instruction execution timing. 
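
As a rough illustration of what a two-cycle latency with one-cycle throughput means for a run of back-to-back loads, the following sketch computes the total cycle count. It assumes every access hits in the cache and that nothing else stalls the pipeline; the function name pipelined_cycles is hypothetical.

```python
def pipelined_cycles(count, latency=2, throughput=1):
    """Cycles needed to finish `count` back-to-back pipelined operations.

    The first result appears after `latency` cycles; each further result
    follows `throughput` cycles later. Assumes cache hits and no stalls.
    """
    if count <= 0:
        return 0
    return latency + (count - 1) * throughput


four_loads = pipelined_cycles(4)   # 2 + 3 * 1 = 5 cycles
```

Under these assumptions, four loads take five cycles rather than the eight that a non-pipelined unit with the same latency would need.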

6.4.5 System Register Unit Execution Timing 

The majority of the instructions executed by the SRU access or modify nonrenamed 
registers, or directly access renamed registers, and generally execute in a serial manner. 
Results from these instructions will not be available or forwarded for use by subsequent 
instructions until the instruction completes and is retired. The SRU can also execute the 
integer instructions addi, addis, add, addo, cmpi, cmp, cmpli, and cmpl without
serialization, and in parallel with another integer instruction. Refer to Section 6.3.3.2,
“Instruction Serialization,” for additional information on serializing instructions executed 
by the SRU, and Table 6-2, Table 6-3, and Table 6-4 for SRU instruction execution timing. 

6.5 Memory Performance Considerations 

Due to the 603e’s instruction throughput of three instructions per clock cycle, lack of data 
bandwidth can become a performance bottleneck. In order for the 603e to approach its 
potential performance levels, it must be able to read and write data quickly and efficiently. 
If there are many processors in a system environment, one processor may experience long 
memory latencies while another bus master (for example, a direct memory access 
controller) is using the external bus. 

In order to alleviate this possible contention, the 603e provides three memory update 
modes — copy-back, write-through, and cache-inhibit. Each page of memory is specified to 
be in one of these modes. If a page is in copy-back mode, data being stored to that page is 
written only to the on-chip cache. If a page is in write-through mode, writes to that page 
update the on-chip cache on hits and always update main memory. If a page is cache- 
inhibited, data in that page will never be stored in the on-chip cache. All three of these 
modes of operation have advantages and disadvantages. A decision as to which mode to use 
depends on the system environment as well as the application. 

This section describes how performance is impacted by each memory update mode. For 
details about the operation of the on-chip cache and the memory update modes, see 
Chapter 3, “Instruction and Data Cache Operation.” 

6.5.1 Copy-Back Mode

When storing data while in copy-back mode, store operations for cacheable data do not 
necessarily cause an external bus cycle to update memory. Instead, memory updates only 
occur on modified line replacements, cache flushes, or when another processor attempts to 
access a specific address for which there is a corresponding modified cache entry. For this 
reason, copy-back mode may be preferred when external bus bandwidth is a potential 
bottleneck — for example, in a multiprocessor environment. Copy-back mode is also well
suited for data that is closely coupled to a processor, such as local variables. 

If more than one device uses data stored in a page that is in copy-back mode, snooping must 
be enabled to allow copy-back operations and cache invalidations of modified data. The
603e implements snooping hardware to prevent other devices from accessing invalid data. 
When bus snooping is enabled, the processor monitors the transactions of the other devices. 
For example, if another device accesses a memory location and its memory-coherent (M) 
bit is set, and the 603e’s on-chip cache has a modified value for that address, the processor 
preempts the bus transaction, and updates memory with the cache data. If the cache 
contents associated with the snooped address are unmodified, the 603e will invalidate the 
cache block. The other device is then free to attempt an access to the updated memory 
address. See Chapter 3, “Instruction and Data Cache Operation,” for complete information 
about bus snooping. 

Copy-back mode provides complete cache/memory coherency while maximizing
available external bus bandwidth.

6.5.2 Write-Through Mode 

Store operations to memory in write-through mode always update memory as well as the 
on-chip cache (on cache hits). Write-through mode is used when the data in the cache must 
always agree with external memory (for example, video memory), or when there is shared 
(global) data that may be used frequently, or when allocation of a cache line on a cache miss 
is undesirable. Automatic copy back of cached data is not performed if that data is from a 
memory page marked as write-through mode since valid cache data always agrees with 
memory. 

Stores to memory that are in write-through mode may cause a decrease in performance. 
Each time a store is performed to memory in write-through mode, the bus will be busy for 
the extra clock cycles required to perform the memory update; therefore, load operations 
that miss the on-chip cache must wait while the external store operation completes. 

6.5.3 Cache-Inhibited Accesses 

If a memory page is specified to be cache-inhibited, data from this page will not be stored 
in the on-chip cache. 

Areas of the memory map can be cache-inhibited by the operating system software. If a
cache-inhibited access hits in the on-chip cache, the corresponding cache line is 
invalidated. If the line is marked as modified, it is copied back to memory before being 
invalidated. 

In summary, the copy-back mode allows both load and store operations to use the on-chip 
cache. The write-through mode allows load operations to use the on-chip cache, but store 
operations cause a memory access and a cache update if the data is already in the cache. 
Lastly, the cache-inhibited mode causes memory access for both loads and stores. 
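
The summary above can be restated as a small decision function. This is a simplified sketch of the store behavior only, as described in this section; the names Mode, store_effects, and load_may_use_cache are hypothetical, and line fills, snooping, and copy-back of replaced lines are deliberately left out.

```python
from enum import Enum


class Mode(Enum):
    COPY_BACK = "copy-back"
    WRITE_THROUGH = "write-through"
    CACHE_INHIBITED = "cache-inhibited"


def store_effects(mode, cache_hit):
    """Return (updates_cache, updates_memory) for a single store."""
    if mode is Mode.COPY_BACK:
        # Data is written only to the on-chip cache.
        return (True, False)
    if mode is Mode.WRITE_THROUGH:
        # Memory is always updated; the cache only on a hit.
        return (cache_hit, True)
    # Cache-inhibited: the data is never stored in the on-chip cache.
    return (False, True)


def load_may_use_cache(mode):
    """Loads can be serviced by the cache unless the page is cache-inhibited."""
    return mode is not Mode.CACHE_INHIBITED
```

For example, store_effects(Mode.WRITE_THROUGH, cache_hit=False) yields (False, True): the store goes to memory and no cache line is updated.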

6.6 Instruction Scheduling Guidelines 

The performance of the 603e can be improved by avoiding resource conflicts and 
promoting parallel utilization of execution units through efficient instruction scheduling. 
Instruction scheduling on the 603e can be improved by observing the following guidelines: 

• Implement good static branch prediction (setting of y bit in BO field). 

• When branch prediction is uncertain, or an even probability, predict fall through. 

• To reduce mispredictions, separate the instruction that sets CR bits from the branch 
instruction that evaluates them; separation by more than nine instructions ensures 
that the CR bits will be immediately available for evaluation. 

• When branching conditionally to a location specified by count registers (CTRs) or 
link registers (LRs), or when branching conditionally based on the value in the count 
register, separate the mtspr instruction that initializes the CTR or LR from the 
branch instruction performing the evaluation. Separation of the branch instruction 
and the mtspr instruction by more than nine instructions ensures the register values 
will be immediately available for use by the branch instruction. 

• Schedule instructions such that they can dual issue. 

• Schedule instructions to minimize execution-unit-busy stalls. 

• Avoid using serializing instructions. 

• Schedule instructions to avoid dispatch stalls due to renamed resource limitations.

— Only five instructions can be in execute-complete stage at any one time.

— Only five GPR destinations can be in execute-complete-deallocate stage at any
one time. Note that load with update address instructions use two destination
registers.

— Only four FPR destinations can be in execute-complete-deallocate stage at any
one time.
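
The three rename-resource limits in the last guideline can be checked mechanically when evaluating a candidate instruction schedule. The sketch below is an assumption-laden illustration: the function name fits_rename_limits and the (gpr_dests, fpr_dests) tuple encoding are hypothetical, and only the numeric limits come from the guideline itself.

```python
MAX_IN_FLIGHT = 5    # instructions in the execute-complete stage
MAX_GPR_DESTS = 5    # GPR destinations in execute-complete-deallocate
MAX_FPR_DESTS = 4    # FPR destinations in execute-complete-deallocate


def fits_rename_limits(in_flight, candidate):
    """Check whether `candidate` can join the in-flight window.

    Each instruction is a (gpr_dests, fpr_dests) tuple; a load with
    update counts as two GPR destinations, i.e. (2, 0).
    """
    window = list(in_flight) + [candidate]
    gpr = sum(g for g, _ in window)
    fpr = sum(f for _, f in window)
    return (len(window) <= MAX_IN_FLIGHT
            and gpr <= MAX_GPR_DESTS
            and fpr <= MAX_FPR_DESTS)


# Two load-with-update instructions already hold four GPR destinations,
# so a third one (two more destinations) would exceed the limit of five.
third_update_load_fits = fits_rename_limits([(2, 0), (2, 0)], (2, 0))
```

A scheduler could use a check of this kind to interleave instructions with fewer destination registers between load-with-update instructions.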

6.6.1 Branch, Dispatch, and Completion Unit Resource Requirements

This section describes the specific resources required to avoid stalls during branch 
resolution, instruction dispatching, and instruction completion. 

6.6.1.1 Branch Resolution Resource Requirements

The following is a list of branch instructions and the resources required to avoid stalling the
fetch unit in the course of branch resolution:

• The bclr instruction requires LR availability. 

• The bcctr instruction requires CTR availability. 

• “Branch and link” instructions require shadow LR availability. 

• The “branch conditional on counter decrement and CR condition” requires CTR 
availability or the CR condition must be false, and the 603e cannot be executing
instructions following an unresolved predicted branch when the branch is 
encountered by the BPU. 

• The “branch conditional on CR condition” cannot be executed following an 
unresolved predicted branch instruction. 

6.6.1.2 Dispatch Unit Resource Requirements

The following is a list of resources required to avoid stalls in the dispatch unit; note that the 
two dispatch buffers are described as DQ[0] and DQ[1], where DQ[0] is the dispatch buffer 
located at the very bottom of the dispatch queue: 

• Requirements for dispatching from DQ[0] are as follows:

— Needed execution unit is available
— Needed GPR rename register(s) are available
— Needed FPR rename register(s) are available
— Completion buffer is not full
— If the instruction is dispatch serialized, the completion buffer is empty
— A dispatch serialized instruction is not currently being executed

• Requirements for dispatching from DQ[1] are as follows:

— Instruction in DQ[0] must dispatch
— Instruction dispatched by DQ[0] is not dispatch serialized
— Needed execution unit is available (after dispatch from DQ[0])
— Needed GPR rename register(s) are available (after dispatch from DQ[0])
— Needed FPR rename register is available (after dispatch from DQ[0])
— Completion buffer is not full (after dispatch from DQ[0])
— Instruction dispatched from DQ[1] is not dispatch serialized
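
The two requirement lists above can be read as a pair of predicates, where the DQ[1] check is the DQ[0] check applied to the resources left over after DQ[0] dispatches. The following sketch is one interpretation under stated assumptions: the Instr and Resources types and all field names are hypothetical, and the serialization rule for DQ[0] is read as "a dispatch-serialized instruction needs an empty completion buffer".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Instr:
    unit: str                      # e.g. "IU", "LSU", "FPU", "SRU"
    gpr_dests: int = 0
    fpr_dests: int = 0
    dispatch_serialized: bool = False


@dataclass(frozen=True)
class Resources:
    free_units: frozenset
    free_gpr_renames: int
    free_fpr_renames: int
    free_completion_slots: int
    completion_empty: bool
    serialized_in_flight: bool = False


def can_dispatch_dq0(instr, res):
    """Requirements for dispatching from DQ[0]."""
    if instr.unit not in res.free_units:
        return False
    if instr.gpr_dests > res.free_gpr_renames:
        return False
    if instr.fpr_dests > res.free_fpr_renames:
        return False
    if res.free_completion_slots < 1:
        return False
    if instr.dispatch_serialized and not res.completion_empty:
        return False
    return not res.serialized_in_flight


def after_dispatch(instr, res):
    """Resources remaining once `instr` has been dispatched."""
    return replace(
        res,
        free_units=res.free_units - {instr.unit},
        free_gpr_renames=res.free_gpr_renames - instr.gpr_dests,
        free_fpr_renames=res.free_fpr_renames - instr.fpr_dests,
        free_completion_slots=res.free_completion_slots - 1,
        completion_empty=False,
    )


def can_dispatch_dq1(first, second, res):
    """Requirements for dispatching from DQ[1] in the same cycle."""
    if not can_dispatch_dq0(first, res):
        return False
    if first.dispatch_serialized or second.dispatch_serialized:
        return False
    return can_dispatch_dq0(second, after_dispatch(first, res))
```

With two free GPR rename registers, an integer add in DQ[0] leaves only one, so a load with update (two GPR destinations) in DQ[1] would not dual issue in this model.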

6.6.1.3 Completion Unit Resource Requirements

The following is a list of resources required to avoid stalls in the completion unit; note that 
the two completion buffers are described as CQ[0] and CQ[1], where CQ[0] is the 
completion buffer located at the very end of the completion queue: 

• Requirements for completing an instruction from CQ[0] are as follows: 

— Instruction in CQ[0] must be finished 

— Instruction in CQ[0] must not follow an unresolved predicted branch 
— Instruction in CQ[0] must not cause an exception 

• Requirements for completing an instruction from CQ[1] are as follows: 

— Instruction in CQ[0] must complete in same cycle 

— Instruction in CQ[1] must be finished 

— Instruction in CQ[1] must not follow an unresolved predicted branch 
— Instruction in CQ[1] must not cause an exception 
— Instruction in CQ[1] must be an integer or load instruction 
— Number of CR updates from both CQ[0] and CQ[1] must not exceed one 
— Number of GPR updates from both CQ[0] and CQ[1] must not exceed two 
— Number of FPR updates from both CQ[0] and CQ[1] must not exceed one 
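
The completion rules can be sketched in the same way. This is an illustrative model only; the Entry type, its field names, and the kind strings are hypothetical, while the conditions themselves are taken from the two lists above.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    finished: bool
    kind: str = "integer"          # "integer", "load", "store", "float", ...
    after_unresolved_branch: bool = False
    causes_exception: bool = False
    cr_updates: int = 0
    gpr_updates: int = 0
    fpr_updates: int = 0


def can_complete_cq0(e):
    """Requirements for completing the instruction in CQ[0]."""
    return (e.finished
            and not e.after_unresolved_branch
            and not e.causes_exception)


def can_complete_cq1(e0, e1):
    """Requirements for completing CQ[1] in the same cycle as CQ[0]."""
    if not (can_complete_cq0(e0) and can_complete_cq0(e1)):
        return False
    if e1.kind not in ("integer", "load"):
        return False
    return (e0.cr_updates + e1.cr_updates <= 1
            and e0.gpr_updates + e1.gpr_updates <= 2
            and e0.fpr_updates + e1.fpr_updates <= 1)
```

In this model, a finished floating-point instruction in CQ[1] never completes together with CQ[0], and two instructions that both update the CR complete on separate cycles.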

6.7 Instruction Latency Summary 

Table 6-1 through Table 6-6 list the latencies associated with each instruction executed by 
the 603e. Note that the instruction latency tables contain no 64-bit architected instructions. 
These instructions will trap to an illegal instruction exception handler when encountered. 
Recall that the term latency is defined as the total time it takes to execute an instruction and 
make ready the results of that instruction. 

Table 6-1 provides the latencies for the branch instructions. 



Table 6-1. Branch Instructions

Primary  Extended  Mnemonic   Unit  Cycles
16       -         bc[l][a]   BPU   1*
18       -         b[l][a]    BPU   1*
19       016       bclr[l]    BPU   1*
19       528       bcctr[l]   BPU   1*

* These operations may be folded for an effective cycle time of 0.

Table 6-2 provides the latencies for the system register instructions.

Table 6-2. System Register Instructions

Primary  Extended  Mnemonic              Unit  Cycles
17       --1       sc                    SRU   3
19       050       rfi                   SRU   3
19       150       isync                 SRU   1&
31       083       mfmsr                 SRU   1
31       146       mtmsr                 SRU   2
31       210       mtsr                  SRU   2
31       242       mtsrin                SRU   2
31       339       mfspr (not I/DBATs)   SRU   1
31       339       mfspr (DBATs)         SRU   3&
31       339       mfspr (IBATs)         SRU   3&
31       467       mtspr (not IBATs)     SRU   2 (XER-&)
31       467       mtspr (IBATs)         SRU   2&
31       595       mfsr                  SRU   3&
31       598       sync                  SRU   1&
31       659       mfsrin                SRU   3&
31       854       eieio                 SRU   1
31       371       mftb                  SRU   1
31       467       mttb                  SRU   1

Note: Cycle times marked with "&" require a variable number of cycles due to
serialization.



Table 6-3 provides the latencies for the condition register logical instructions. 



Table 6-3. Condition Register Logical Instructions

Primary  Extended  Mnemonic  Unit  Cycles
19       417       crorc     SRU   1
19       449       cror      SRU   1
31       019       mfcr      SRU   1
31       144       mtcrf     SRU   1
31       512       mcrxr     SRU   1&

Note: Cycle times marked with "&" require a variable number of
cycles due to serialization.

Table 6-4 provides the latencies for the integer instructions. 

Table 6-4. Integer Instructions

Primary  Extended  Mnemonic       Unit               Cycles
03       -         twi            Integer            2
07       -         mulli          Integer            2, 3
08       -         subfic         Integer            1
10       -         cmpli          Integer & SRU      1^
11       -         cmpi           Integer & SRU      1^
12       -         addic          Integer            1
13       -         addic.         Integer            1
14       -         addi           Integer & SRU      1
15       -         addis          Integer & SRU      1
20       -         rlwimi[.]      Integer            1
21       -         rlwinm[.]      Integer            1
23       -         rlwnm[.]       Integer            1
24       -         ori            Integer            1
25       -         oris           Integer            1
26       -         xori           Integer            1
27       -         xoris          Integer            1
28       -         andi.          Integer            1
29       -         andis.         Integer            1
31       000       cmp            Integer & SRU      1^
31       004       tw             Integer            2
31       008       subfc[o][.]    Integer            1
31       010       addc[o][.]     Integer            1
31       011       mulhwu[.]      Integer            2, 3, 4, 5, 6
31       024       slw[.]         Integer            1
31       026       cntlzw[.]      Integer            1
31       028       and[.]         Integer            1
31       032       cmpl           Integer & SRU      1^
31       040       subf[.]        Integer            1
31       060       andc[.]        Integer            1
31       075       mulhw[.]       Integer            2, 3, 4, 5
31       104       neg[o][.]      Integer            1
31       124       nor[.]         Integer            1
31       136       subfe[o][.]    Integer            1
31       138       adde[o][.]     Integer            1
31       200       subfze[o][.]   Integer            1
31       202       addze[o][.]    Integer            1
31       232       subfme[o][.]   Integer            1
31       234       addme[o][.]    Integer            1
31       235       mullw[o][.]    Integer            2, 3, 4, 5
31       266       add[o][.]      Integer & SRU (1)  1
31       284       eqv[.]         Integer            1
31       316       xor[.]         Integer            1
31       412       orc[.]         Integer            1
31       444       or[.]          Integer            1
31       459       divwu[o][.]    Integer            37
31       476       nand[.]        Integer            1
31       491       divw[o][.]     Integer            37
31       536       srw[.]         Integer            1
31       792       sraw[.]        Integer            1
31       824       srawi[.]       Integer            1
31       922       extsh[.]       Integer            1
31       954       extsb[.]       Integer            1

Note: Cycle times marked with "^" immediately forward their CR
results to the BPU for fast branch resolution.

1. The SRU can only execute the add and add[o] instructions.



Table 6-5 provides the latencies for the floating-point instructions. 



Table 6-5. Floating-Point Instructions

Primary  Extended  Mnemonic     Unit  Cycles
59       018       fdivs[.]     FPU   18^
59       020       fsubs[.]     FPU   1-1-1^
59       021       fadds[.]     FPU   1-1-1^
59       024       fres[.]      FPU   18^
59       025       fmuls[.]     FPU   1-1-1^
59       028       fmsubs[.]    FPU   1-1-1^
59       029       fmadds[.]    FPU   1-1-1^
59       030       fnmsubs[.]   FPU   1-1-1^
59       031       fnmadds[.]   FPU   1-1-1^
63       000       fcmpu        FPU   1-1-1^
63       012       frsp[.]      FPU   1-1-1^
63       014       fctiw[.]     FPU   1-1-1^
63       015       fctiwz[.]    FPU   1-1-1^
63       018       fdiv[.]      FPU   33^
63       020       fsub[.]      FPU   1-1-1^
63       021       fadd[.]      FPU   1-1-1^
63       023       fsel[.]      FPU   1-1-1^
63       025       fmul[.]      FPU   2-1-1^
63       026       frsqrte[.]   FPU   1-1-1^
63       028       fmsub[.]     FPU   2-1-1^
63       029       fmadd[.]     FPU   2-1-1^
63       030       fnmsub[.]    FPU   2-1-1^
63       031       fnmadd[.]    FPU   2-1-1^
63       032       fcmpo        FPU   1-1-1^
63       038       mtfsb1[.]    FPU   1-1-1&^
63       040       fneg[.]      FPU   1-1-1^
63       064       mcrfs        FPU   1-1-1&
63       070       mtfsb0[.]    FPU   1-1-1&^
63       072       fmr[.]       FPU   1-1-1^
63       134       mtfsfi[.]    FPU   1-1-1&^
63       136       fnabs[.]     FPU   1-1-1^
63       264       fabs[.]      FPU   1-1-1^
63       583       mffs[.]      FPU   1-1-1&^
63       711       mtfsf[.]     FPU   1-1-1&^

Notes: Cycle times marked with "&" require a variable number of
cycles due to completion serialization.

Cycle times marked with "^" immediately forward their CR
results to the BPU for fast branch resolution.

Cycle times marked with a "-" specify the number of clock
cycles in each pipeline stage. Instructions with a single entry
in the cycles column are not pipelined.
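
Entries such as 2-1-1 in the cycles column can be turned into latency and throughput figures mechanically. The sketch below is an illustration under assumptions: the function name parse_cycles is hypothetical, trailing marker characters are simply ignored, and throughput is taken to be bounded by the longest stage, which agrees with the one- or two-cycle throughput quoted in Section 6.4.3.

```python
import re


def parse_cycles(entry):
    """Split a cycles entry into per-stage counts.

    Returns (stages, latency, throughput). A single number means the
    instruction is not pipelined, so throughput equals latency.
    """
    match = re.match(r"\s*(\d+(?:-\d+)*)", entry)
    if match is None:
        raise ValueError(f"no cycle count in {entry!r}")
    stages = [int(n) for n in match.group(1).split("-")]
    latency = sum(stages)
    pipelined = len(stages) > 1
    # Assumption: a pipelined instruction can start every max(stage) cycles.
    throughput = max(stages) if pipelined else latency
    return stages, latency, throughput


fmadd = parse_cycles("2-1-1")   # double-precision multiply-add entry
fdiv = parse_cycles("33")       # non-pipelined divide entry
```

Here the 2-1-1 entry gives a four-cycle latency with two-cycle throughput, while the single entry 33 gives 33 cycles for both.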



Table 6-6 provides latencies for the load and store instructions. 



Table 6-6. Load and Store Instructions

Primary  Extended  Mnemonic  Unit  Cycles
31       279       lhzx      LSU   2:1
31       306       tlbie     LSU   3&
31       310       eciwx     LSU   2:1
31       311       lhzux     LSU   2:1
31       343       lhax      LSU   2:1
31       375       lhaux     LSU   2:1
31       407       sthx      LSU   2:1
31       438       ecowx     LSU   2:1
31       439       sthux     LSU   2:1
31       470       dcbi      LSU   2&
31       533       lswx      LSU   2 + n&
31       534       lwbrx     LSU   2:1
31       535       lfsx      LSU   2:1
31       567       lfsux     LSU   2:1
31       597       lswi      LSU   2 + n&
31       599       lfdx      LSU   2:1
31       631       lfdux     LSU   2:1
31       661       stswx     LSU   1 + n&
31       662       stwbrx    LSU   2:1
31       663       stfsx     LSU   2:1
31       695       stfsux    LSU   2:1
31       725       stswi     LSU   1 + n&
31       727       stfdx     LSU   2:1
31       759       stfdux    LSU   2:1
31       790       lhbrx     LSU   2:1
31       918       sthbrx    LSU   2:1
31       978       tlbld     LSU   2&
31       982       icbi      LSU   3&
31       983       stfiwx    LSU   2:1
31       1010      tlbli     LSU   3&
31       1014      dcbz      LSU   10&
32       -         lwz       LSU   2:1
33       -         lwzu      LSU   2:1
34       -         lbz       LSU   2:1
35       -         lbzu      LSU   2:1
36       -         stw       LSU   2:1
37       -         stwu      LSU   2:1
38       -         stb       LSU   2:1
39       -         stbu      LSU   2:1
40       -         lhz       LSU   2:1
41       -         lhzu      LSU   2:1
42       -         lha       LSU   2:1
43       -         lhau      LSU   2:1
44       -         sth       LSU   2:1
45       -         sthu      LSU   2:1
46       -         lmw       LSU   2 + n&
47       -         stmw      LSU   1 + n&
48       -         lfs       LSU   2:1
49       -         lfsu      LSU   2:1
50       -         lfd       LSU   2:1
51       -         lfdu      LSU   2:1
52       -         stfs      LSU   2:1
53       -         stfsu     LSU   2:1
54       -         stfd      LSU   2:1
55       -         stfdu     LSU   2:1



Notes: Cycle times marked with "&" require a variable number of cycles
due to serialization.



Cycle times marked with a "/" specify hit and miss times for
cache management instructions that require conditional bus
activity.

Cycle times marked with a ":" specify cycles of total latency
and throughput for pipelined load and store instructions.

Load and store multiple and string instruction cycles are shown 
as a fixed number of cycles plus a variable number of cycles 
where “n” is the number of words accessed by the instruction. 

Chapter 7

Signal Descriptions 

This chapter describes the PowerPC 603e microprocessor’s external signals. It contains a 
concise description of individual signals, showing behavior when the signal is asserted and 
negated and when the signal is an input and an output. 

NOTE 

A bar over a signal name indicates that the signal is active 
low — for example, ARTRY (address retry) and TS (transfer
start). Active-low signals are referred to as asserted (active) 
when they are low and negated when they are high. Signals that 
are not active-low, such as AP0-AP3 (address bus parity 
signals) and TT0-TT4 (transfer type signals) are referred to as 
asserted when they are high and negated when they are low. 
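
The asserted/negated convention in this note reduces to a single comparison. A minimal sketch, assuming a signal is represented by its electrical level plus a flag saying whether it is active-low; the function name is_asserted is hypothetical.

```python
def is_asserted(level_high, active_low):
    """Return True if a signal at the given level counts as asserted.

    Active-low signals (for example, ARTRY and TS) are asserted when low;
    active-high signals (for example, AP0-AP3) are asserted when high.
    """
    return level_high != active_low


ts_asserted = is_asserted(level_high=False, active_low=True)    # TS driven low
ap0_asserted = is_asserted(level_high=True, active_low=False)   # AP0 driven high
```

Both calls return True; the same levels with the opposite polarity flag would describe negated signals.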

The 603e signals are grouped as follows: 

• Address arbitration signals — The 603e uses these signals to arbitrate for address bus 
mastership. 

• Address transfer start signals — These signals indicate that a bus master has begun a 
transaction on the address bus. 

• Address transfer signals — These signals, which consist of the address bus, address
parity, and address parity error signals, are used to transfer the address and to ensure 
the integrity of the transfer. 

• Transfer attribute signals — These signals provide information about the type of
transfer, such as the transfer size and whether the transaction is bursted, write- 
through, or cache-inhibited. 

• Address transfer termination signals — These signals are used to acknowledge the
end of the address phase of the transaction. They also indicate whether a condition 
exists that requires the address phase to be repeated. 

• Data arbitration signals — The 603e uses these signals to arbitrate for data bus
mastership. 

• Data transfer signals — These signals, which consist of the data bus, data parity, and 
data parity error signals, are used to transfer the data and to ensure the integrity of 
the transfer. 

• Data transfer termination signals — Data termination signals are required after each
data beat in a data transfer. In a single-beat transaction, the data termination signals 
also indicate the end of the tenure, while in burst accesses, the data termination 
signals apply to individual beats and indicate the end of the tenure only after the final 
data beat. They also indicate whether a condition exists that requires the data phase 
to be repeated. 

• System status signals — These signals include the external interrupt signal, 
checkstop signals, and both soft- and hard-reset signals. These signals are used to 
interrupt and, under various conditions, to reset the processor. 

• JTAG/COP interface signals — The JTAG (IEEE 1149.1) interface and common on-chip
processor (COP) unit provide a serial interface to the system for performing
monitoring and boundary tests.

• Processor status — These signals include the memory reservation signal, machine
quiesce control signals, time base enable signal, and TLBI Sync signal. 

• Clock signals — These signals provide for system clock input and frequency control. 

7.1 Signal Configuration

Figure 7-1 illustrates the 603e microprocessor’s signal configuration, showing how the 
signals are grouped. 



NOTE 

A pinout showing actual pin numbers is included in the 603e 
hardware specifications. 

Figure 7-1. PowerPC 603e Microprocessor Signal Groups

7.2 Signal Descriptions 

This section describes individual 603e signals, grouped according to Figure 7-1. Note that 
the following sections are intended to provide a quick summary of signal functions. 
Chapter 8, “System Interface Operation,” describes many of these signals in greater detail, 
both with respect to how individual signals function and how groups of signals interact. 

7.2.1 Address Bus Arbitration Signals

The address arbitration signals are a collection of input and output signals the 603e uses to 
request the address bus, recognize when the request is granted, and indicate to other devices 
when mastership is granted. For a detailed description of how these signals interact, see 
Section 8.3.1, “Address Bus Arbitration.” 

7.2.1.1 Bus Request (BR) — Output

The bus request (BR) signal is an output signal on the 603e. Following are the state 
meaning and timing comments for the BR signal. 

State Meaning Asserted — Indicates that the 603e is requesting mastership of the
address bus. Note that BR may be asserted for one or more cycles,
and then de-asserted due to an internal cancellation of the bus request
(for example, due to a load hit in the touch load buffer). See
Section 8.3.1, “Address Bus Arbitration.”

Negated — Indicates that the 603e is not requesting the address bus.
The 603e may have no bus operation pending, it may be parked, or
the ARTRY input was asserted on the previous bus clock cycle.

Timing Comments Assertion — Occurs when the 603e is not parked and a bus
transaction is needed. This may occur even if the two possible
pipeline accesses have occurred. BR will also be asserted for one
cycle during the execution of a dcbz instruction, and during the
execution of a load instruction which hits in the touch load buffer.

Negation — Occurs for at least one bus clock cycle after an accepted,
qualified bus grant (see BG and ABB), even if another transaction is
pending. It is also negated for at least one bus clock cycle when the
assertion of ARTRY is detected on the bus.

7.2.1.2 Bus Grant (BG) — Input

The bus grant (BG) signal is an input signal on the 603e. Following are the state meaning 
and timing comments for the BG signal. 

State Meaning Asserted — Indicates that the 603e may, with the proper qualification,
assume mastership of the address bus. A qualified bus grant occurs
when BG is asserted and ABB and ARTRY (after AACK) are not
asserted. The ABB and ARTRY signals are driven by the 603e or
other bus masters. If the 603e is parked, BR need not be asserted for
the qualified bus grant. See Section 8.3.1, “Address Bus
Arbitration.”

Negated — Indicates that the 603e is not the next potential address
bus master.

Timing Comments Assertion — May occur at any time to indicate the 603e is free to use
the address bus. After the 603e assumes bus mastership, it does not
check for a qualified bus grant again until the cycle during which the
address bus tenure is completed (assuming it has another transaction
to run). The 603e does not accept a BG in the cycles between the
assertion of any TS and AACK.

Negation — May occur at any time to indicate the 603e cannot use the
bus. The 603e may still assume bus mastership on the bus clock
cycle of the negation of BG because during the previous cycle BG
indicated to the 603e that it was free to take mastership (if qualified).
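
The qualification rule for a bus grant can be expressed as a small predicate. The following is a minimal sketch, not part of the 603e documentation: the function name is hypothetical, and each argument is assumed to be the logical state of the signal (True means asserted) as sampled on a single bus clock cycle, with electrical polarity ignored.

```python
def qualified_bus_grant(bg, abb, artry):
    """Return True when BG is asserted while ABB and ARTRY are not asserted.

    Arguments are logical signal states (True = asserted); this is an
    illustrative model only, not a timing-accurate description.
    """
    return bool(bg) and not abb and not artry


# A grant while another master still holds the address bus is not qualified.
print(qualified_bus_grant(bg=True, abb=True, artry=False))   # False
print(qualified_bus_grant(bg=True, abb=False, artry=False))  # True
```

Note that BR does not appear in the predicate; this matches the statement that a parked 603e does not need to assert BR to receive a qualified bus grant.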

7.2.1.3 Address Bus Busy (ABB)

The address bus busy (ABB) signal is both an input and an output signal.

7.2.1.3.1 Address Bus Busy (ABB) — Output

Following are the state meaning and timing comments for the ABB output signal. 

State Meaning Asserted — Indicates that the 603e is the address bus master. See
Section 8.3.1, “Address Bus Arbitration.”

Negated — Indicates that the 603e is not using the address bus. If
ABB is negated during the bus clock cycle following a qualified bus
grant, the 603e did not accept mastership, even if BR was asserted.
This can occur if a potential transaction is aborted internally before
the transaction is started.

Timing Comments Assertion — Occurs on the bus clock cycle following a qualified BG
that is accepted by the processor (see Negated).

Negation — Occurs for a minimum of one-half bus clock cycle
following the assertion of AACK. If ABB is negated during the bus
clock cycle following a qualified bus grant, the 603e did not accept
mastership, even if BR was asserted.

High Impedance — Occurs after ABB is negated.

7.2.1.3.2 Address Bus Busy (ABB) — Input

Following are the state meaning and timing comments for the ABB input signal. 

State Meaning Asserted — Indicates that the address bus is in use. This condition
effectively blocks the 603e from assuming address bus ownership,
regardless of the BG input; see Section 8.3.1, “Address Bus
Arbitration.”

Negated — Indicates that the address bus is not owned by another bus
master and that it is available to the 603e when accompanied by a
qualified bus grant.

Timing Comments Assertion — May occur when the 603e must be prevented from using
the address bus (and the processor is not currently asserting ABB).

Negation — May occur whenever the 603e can use the address bus.

7.2.2 Address Transfer Start Signals

Address transfer start signals are input and output signals that indicate that an address bus 
transfer has begun. The transfer start (TS) signal identifies the operation as a memory 
transaction. 

For detailed information about how TS interacts with other signals, refer to Section 8.3.2, 
“Address Transfer.” 

7.2.2.1 Transfer Start (TS) 

The TS signal is both an input and an output signal on the 603e. 

7.2.2.1.1 Transfer Start (TS) — Output 

Following are the state meaning and timing comments for the TS output signal. 

State Meaning Asserted — Indicates that the 603e has begun a memory bus
transaction and that the address bus and transfer attribute signals are
valid. When asserted with the appropriate TT0-TT4 signals it is also
an implied data bus request for a memory transaction (unless it is an
address-only operation).

Negated — Indicates that no bus transaction is occurring during
normal operation.

Timing Comments Assertion — Coincides with the assertion of ABB.

Negation — Occurs one bus clock cycle after TS is asserted.

High Impedance — Coincides with the negation of ABB.

7.2.2.1.2 Transfer Start (TS)— Input 

Following are the state meaning and timing comments for the TS input signal. 

State Meaning Asserted — Indicates that another master has begun a bus transaction
and that the address bus and transfer attribute signals are valid for
snooping (see GBL).

Negated — Indicates that no bus transaction is occurring.

Timing Comments Assertion — May occur during the assertion of ABB.

Negation — Must occur one bus clock cycle after TS is asserted.

7.2.3 Address Transfer Signals 

The address transfer signals are used to transmit the address and to generate and monitor 
parity for the address transfer. For a detailed description of how these signals interact, refer 
to Section 8.3.2, “Address Transfer.” 

7.2.3.1 Address Bus (A0-A31) 

The address bus (A0-A31) consists of 32 signals that are both input and output signals.

7.2.3.1.1 Address Bus (A0-A31)— Output

Following are the state meaning and timing comments for the A0-A31 output signals.

State Meaning Asserted/Negated — Represents the physical address (real address in
the architecture specification) of the data to be transferred. On burst
transfers, the address bus presents the double-word-aligned address
containing the critical code/data that missed the cache on a read
operation, or the first double word of the cache line on a write
operation. Note that the address output during burst operations is not
incremented. See Section 8.3.2, “Address Transfer.”

Timing Comments Assertion/Negation — Occurs on the bus clock cycle after a qualified
bus grant (coincides with assertion of ABB and TS).

High Impedance — Occurs one bus clock cycle after AACK is
asserted.

7.2.3.1.2 Address Bus (A0-A31)— Input 

Following are the state meaning and timing comments for the A0-A31 input signals.

State Meaning Asserted/Negated — Represents the physical address of a snoop
operation.

Timing Comments Assertion/Negation — Must occur on the same bus clock cycle as the
assertion of TS; is sampled by the 603e only on this cycle.

7.2.3.2 Address Bus Parity (AP0-AP3) 

The address bus parity (AP0-AP3) signals are both input and output signals reflecting one 
bit of odd-byte parity for each of the 4 bytes of address when a valid address is on the bus. 

7.2.3.2.1 Address Bus Parity (AP0-AP3)— Output 

Following are the state meaning and timing comments for the AP0-AP3 output signal on 
the 603e. 

State Meaning Asserted/Negated — Represents odd parity for each of 4 bytes of the
physical address for a transaction. Odd parity means that an odd
number of bits, including the parity bit, are driven high. The signal
assignments correspond to the following:

AP0 A0-A7
AP1 A8-A15
AP2 A16-A23
AP3 A24-A31

For more information, see Section 8.3.2.1, “Address Bus Parity.”

Timing Comments Assertion/Negation — The same as A0-A31.

High Impedance — The same as A0-A31.

7.2.3.2.2 Address Bus Parity (AP0-AP3) — Input

Following are the state meaning and timing comments for the AP0-AP3 input signals on the
603e.

State Meaning Asserted/Negated — Represents odd parity for each of 4 bytes of the
physical address for snooping operations. Detected even parity
causes the processor to take a machine check exception or enter the
checkstop state if address parity checking is enabled in the HID0
register; see Section 2.1.2.1, “Hardware Implementation Registers
(HID0 and HID1).” (See also the APE signal description.)

Timing Comments Assertion/Negation — The same as A0-A31.
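
As an illustration of the odd-parity rule and the AP0-AP3 byte assignments, the following sketch computes the four parity bits for a 32-bit physical address. It is a hedged example rather than a description of the 603e hardware: the function name is hypothetical, values are logical bit values (1 means driven high), and A0 is taken to be the most significant address bit, following PowerPC bit numbering.

```python
def address_parity(addr):
    """Return [AP0, AP1, AP2, AP3] giving odd parity per address byte.

    Assumes A0 is the most significant address bit, so AP0 covers the
    most significant byte (A0-A7) and AP3 the least significant (A24-A31).
    """
    if not 0 <= addr < (1 << 32):
        raise ValueError("address must fit in 32 bits")
    parity = []
    for lane in range(4):
        byte = (addr >> (24 - 8 * lane)) & 0xFF
        ones = bin(byte).count("1")
        # The parity bit makes the total count of high bits odd.
        parity.append(0 if ones % 2 else 1)
    return parity


print(address_parity(0x01000000))  # [0, 1, 1, 1]
```

A snooping device could apply the same function to the sampled address and compare the result with the sampled AP0-AP3 values; a mismatch corresponds to detected even parity on that byte.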

7.2.3.3 Address Parity Error (APE)— Output 

The address parity error (APE) signal is an output signal on the 603e. Note that the APE
signal is an open-drain type output and requires an external pull-up resistor (for example,
10 kΩ to Vdd) to assure proper de-assertion of the APE signal. Following are the state
meaning and timing comments for the APE signal on the 603e. The APE signal will not be
asserted if address parity checking is disabled (HID0[EBA] cleared to 0). For more
information, see Section 8.3.2.1, “Address Bus Parity.”

State Meaning Asserted — Indicates incorrect address bus parity has been detected
by the 603e on a snoop (GBL asserted).

Negated — Indicates that the 603e has not detected a parity error
(even parity) on the address bus.

Timing Comments Assertion — Occurs on the second bus clock cycle after TS is
asserted.

High Impedance — Occurs on the third bus clock cycle after TS is
asserted.

7.2.4 Address Transfer Attribute Signals 

The transfer attribute signals are a set of signals that further characterize the transfer — such 
as the size of the transfer, whether it is a read or write operation, and whether it is a burst 
or single-beat transfer. For a detailed description of how these signals interact, see 
Section 8.3.2, “Address Transfer.” 

Note that some signal functions vary depending on whether the transaction is a memory 
access or an I/O access. 

7.2.4.1 Transfer Type (TT0-TT4) 

The transfer type (TT0-TT4) signals consist of five input/output signals on the 603e. For a
complete description of TT0-TT4 signals and for transfer type encodings, see Table 7-1.

7.2.4.1.1 Transfer Type (TT0-TT4) — Output

Following are the state meaning and timing comments for the TT0-TT4 output signals on
the 603e.

State Meaning Asserted/Negated — Indicates the type of transfer in progress.

Timing Comments Assertion/Negation/High Impedance — The same as A0-A31.

7.2.4.1.2 Transfer Type (TT0-TT4)— Input

Following are the state meaning and timing comments for the TT0-TT3 input signals on
the 603e.

State Meaning Asserted/Negated — Indicates the type of transfer in progress (see
Table 7-2).

Timing Comments Assertion/Negation — The same as A0-A31.

Table 7-1 describes the transfer encodings for a 603e bus master. 



Table 7-1. Transfer Encoding for PowerPC 603e Processor Bus Master

603e Bus Master Transaction | Transaction Source | TT0 | TT1 | TT2 | TT3 | TT4 | 60x Bus Specification Command | Transaction
N/A | N/A | 0 | 0 | 0 | 0 | 0 | Clean block | Address only
N/A | N/A | 0 | 0 | 1 | 0 | 0 | Flush block | Address only
N/A | N/A | 0 | 1 | 0 | 0 | 0 | sync | Address only
Address only | dcbz | 0 | 1 | 1 | 0 | 0 | Kill block | Address only
N/A | N/A | 1 | 0 | 0 | 0 | 0 | eieio | Address only
Single-beat write (nonGBL) | ecowx | 1 | 0 | 1 | 0 | 0 | External control word write | Single-beat write
N/A | N/A | 1 | 1 | 0 | 0 | 0 | TLB invalidate | Address only
Single-beat read (nonGBL) | eciwx | 1 | 1 | 1 | 0 | 0 | External control word read | Single-beat read
N/A | N/A | 0 | 0 | 0 | 0 | 1 | lwarx reservation set | Address only
N/A | N/A | 0 | 0 | 1 | 0 | 1 | Reserved | —
N/A | N/A | 0 | 1 | 0 | 0 | 1 | tlbsync | Address only
N/A | N/A | 0 | 1 | 1 | 0 | 1 | icbi | Address only
N/A | N/A | 1 | X | X | 0 | 1 | Reserved | —
Single-beat write | Caching-inhibited or write-through store | 0 | 0 | 0 | 1 | 0 | Write-with-flush | Single-beat write or burst
Burst (nonGBL) | Cast-out, or snoop copyback | 0 | 0 | 1 | 1 | 0 | Write-with-kill | Single-beat write or burst
Single-beat read | Caching-inhibited load or instruction fetch | 0 | 1 | 0 | 1 | 0 | Read | Single-beat read or burst
Burst | Load miss, store miss, or instruction fetch | 0 | 1 | 1 | 1 | 0 | Read-with-intent-to-modify | Burst
Single-beat write | stwcx. | 1 | 0 | 0 | 1 | 0 | Write-with-flush-atomic | Single-beat write
N/A | N/A | 1 | 0 | 1 | 1 | 0 | Reserved | N/A
Single-beat read | lwarx (caching-inhibited load) | 1 | 1 | 0 | 1 | 0 | Read-atomic | Single-beat read or burst
Burst | lwarx (load miss) | 1 | 1 | 1 | 1 | 0 | Read-with-intent-to-modify-atomic | Burst
N/A | N/A | 0 | 0 | 0 | 1 | 1 | Reserved | —
N/A | N/A | 0 | 0 | 1 | 1 | 1 | Reserved | —
N/A | N/A | 0 | 1 | 0 | 1 | 1 | Read-with-no-intent-to-cache | Single-beat read or burst
N/A | N/A | 0 | 1 | 1 | 1 | 1 | Reserved | —
N/A | N/A | 1 | X | X | 1 | 1 | Reserved | —



Table 7-2 describes the 60x bus specification transfer encodings and the 603e bus snoop
response on an address hit.



Table 7-2. PowerPC 603e Microprocessor Snoop Hit Response

60x Bus Specification Command | Transaction | TT0 | TT1 | TT2 | TT3 | TT4 | 603e Bus Snooper; Action on Hit
Clean block | Address only | 0 | 0 | 0 | 0 | 0 | N/A
Flush block | Address only | 0 | 0 | 1 | 0 | 0 | N/A
sync | Address only | 0 | 1 | 0 | 0 | 0 | N/A
Kill block | Address only | 0 | 1 | 1 | 0 | 0 | Kill, cancel reservation
eieio | Address only | 1 | 0 | 0 | 0 | 0 | N/A
External control word write | Single-beat write | 1 | 0 | 1 | 0 | 0 | N/A
TLB invalidate | Address only | 1 | 1 | 0 | 0 | 0 | N/A
External control word read | Single-beat read | 1 | 1 | 1 | 0 | 0 | N/A
lwarx reservation set | Address only | 0 | 0 | 0 | 0 | 1 | N/A
stwcx. reservation clear | Address only | 0 | 0 | 1 | 0 | 1 | N/A
tlbsync | Address only | 0 | 1 | 0 | 0 | 1 | N/A
icbi | Address only | 0 | 1 | 1 | 0 | 1 | N/A
Reserved | — | 1 | X | X | 0 | 1 | N/A
Write-with-flush | Single-beat write or burst | 0 | 0 | 0 | 1 | 0 | Flush, cancel reservation
Write-with-kill | Single-beat write or burst | 0 | 0 | 1 | 1 | 0 | Kill, cancel reservation
Read | Single-beat read or burst | 0 | 1 | 0 | 1 | 0 | Clean or flush
Read-with-intent-to-modify | Burst | 0 | 1 | 1 | 1 | 0 | Flush
Write-with-flush-atomic | Single-beat write | 1 | 0 | 0 | 1 | 0 | Flush, cancel reservation
Reserved | N/A | 1 | 0 | 1 | 1 | 0 | N/A
Read-atomic | Single-beat read or burst | 1 | 1 | 0 | 1 | 0 | Clean or flush
Read-with-intent-to-modify-atomic | Burst | 1 | 1 | 1 | 1 | 0 | Flush
Reserved | — | 0 | 0 | 0 | 1 | 1 | N/A
Reserved | — | 0 | 0 | 1 | 1 | 1 | N/A
Read-with-no-intent-to-cache | Single-beat read or burst | 0 | 1 | 0 | 1 | 1 | Clean
Reserved | — | 0 | 1 | 1 | 1 | 1 | N/A
Reserved | — | 1 | X | X | 1 | 1 | N/A
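
For readers building a bus monitor or simulator, the encodings in Table 7-2 lend themselves to a lookup table. The sketch below is an illustrative model only: the names TT_COMMANDS, SNOOP_ACTIONS, decode_tt, and snoop_action are hypothetical, bit values are logical (1 means the column shows 1), and the command names follow Table 7-2 (which labels encoding 00101 as the stwcx. reservation clear, whereas Table 7-1 lists it as reserved). Encodings not listed in the dictionary are treated as reserved.

```python
# Encodings are written as TT0 TT1 TT2 TT3 TT4, with TT0 as the leftmost bit.
TT_COMMANDS = {
    0b00000: "Clean block",
    0b00100: "Flush block",
    0b01000: "sync",
    0b01100: "Kill block",
    0b10000: "eieio",
    0b10100: "External control word write",
    0b11000: "TLB invalidate",
    0b11100: "External control word read",
    0b00001: "lwarx reservation set",
    0b00101: "stwcx. reservation clear",
    0b01001: "tlbsync",
    0b01101: "icbi",
    0b00010: "Write-with-flush",
    0b00110: "Write-with-kill",
    0b01010: "Read",
    0b01110: "Read-with-intent-to-modify",
    0b10010: "Write-with-flush-atomic",
    0b11010: "Read-atomic",
    0b11110: "Read-with-intent-to-modify-atomic",
    0b01011: "Read-with-no-intent-to-cache",
}

# Snoop-hit actions from Table 7-2; commands not listed here show N/A.
SNOOP_ACTIONS = {
    "Kill block": "Kill, cancel reservation",
    "Write-with-flush": "Flush, cancel reservation",
    "Write-with-kill": "Kill, cancel reservation",
    "Read": "Clean or flush",
    "Read-with-intent-to-modify": "Flush",
    "Write-with-flush-atomic": "Flush, cancel reservation",
    "Read-atomic": "Clean or flush",
    "Read-with-intent-to-modify-atomic": "Flush",
    "Read-with-no-intent-to-cache": "Clean",
}


def decode_tt(tt0, tt1, tt2, tt3, tt4):
    """Map the five TT bits to the 60x bus command name, or 'Reserved'."""
    code = (tt0 << 4) | (tt1 << 3) | (tt2 << 2) | (tt3 << 1) | tt4
    return TT_COMMANDS.get(code, "Reserved")


def snoop_action(tt0, tt1, tt2, tt3, tt4):
    """Return the action on a snoop hit listed in Table 7-2, or 'N/A'."""
    return SNOOP_ACTIONS.get(decode_tt(tt0, tt1, tt2, tt3, tt4), "N/A")


print(decode_tt(0, 1, 1, 1, 0))     # Read-with-intent-to-modify
print(snoop_action(0, 1, 1, 1, 0))  # Flush
```

Keeping the command names and the snoop actions in separate dictionaries mirrors the two roles of the table: the encoding column identifies the transaction, and the final column applies only when the snooped address hits.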



7.2.4.2 Transfer Size (TSIZ0-TSIZ2)— Output 

The transfer size (TSIZ0-TSIZ2) signals consist of three output signals on the 603e.
Following are the state meaning and timing comments for the TSIZ0-TSIZ2 output signals
on the 603e.

State Meaning Asserted/Negated — For memory accesses, these signals, along with
TBST, indicate the data transfer size for the current bus operation, as
shown in Table 7-3. Table 8-4 shows how the transfer size signals
are used with the address signals for aligned transfers. Table 8-5
shows how the transfer size signals are used with the address signals
for misaligned transfers.

For external control instructions (eciwx and ecowx), TSIZ0-TSIZ2
are used to output bits 29-31 of the external access register (EAR),
which are used to form the resource ID (TBST||TSIZ0-TSIZ2).

Timing Comments Assertion/Negation — The same as A0-A31.

High Impedance — The same as A0-A31.



Table 7-3. Data Transfer Size

TBST | TSIZ0-TSIZ2 | Transfer Size
Asserted | 010 | Burst (32 bytes)
Negated | 000 | 8 bytes
Negated | 001 | 1 byte
Negated | 010 | 2 bytes
Negated | 011 | 3 bytes
Negated | 100 | 4 bytes
Negated | 101 | 5 bytes
Negated | 110 | 6 bytes
Negated | 111 | 7 bytes
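
The mapping in Table 7-3 can be captured in a few lines. This is a minimal sketch with a hypothetical function name; tbst_asserted is the logical state of TBST and tsiz is the three-bit value formed by TSIZ0-TSIZ2 with TSIZ0 as the most significant bit. Burst encodings other than 010 are rejected because the table does not list them.

```python
def transfer_size_bytes(tbst_asserted, tsiz):
    """Return the data transfer size in bytes according to Table 7-3."""
    if not 0 <= tsiz <= 0b111:
        raise ValueError("TSIZ0-TSIZ2 is a 3-bit value")
    if tbst_asserted:
        if tsiz != 0b010:
            raise ValueError("Table 7-3 lists only TSIZ = 010 for bursts")
        return 32
    # With TBST negated, 000 means 8 bytes; other values are the byte count.
    return 8 if tsiz == 0 else tsiz


print(transfer_size_bytes(True, 0b010))   # 32
print(transfer_size_bytes(False, 0b100))  # 4
```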



7.2.4.3 Transfer Burst (TBST) 

The transfer burst (TBST) signal is an input/output signal on the 603e. 

7.2.4.3.1 Transfer Burst (TBST)— Output 

Following are the state meaning and timing comments for the TBST output signal. 

State Meaning Asserted — Indicates that a burst transfer is in progress.

Negated — Indicates that a burst transfer is not in progress.

For external control instructions (eciwx and ecowx), TBST is used
to output bit 28 of the EAR, which is used to form the resource ID
(TBST||TSIZ0-TSIZ2).

Timing Comments Assertion/Negation — The same as A0-A31.

High Impedance — The same as A0-A31.
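
The way the resource ID is spread across TBST and TSIZ0-TSIZ2 for eciwx and ecowx can be sketched as follows. The function name is hypothetical, and two assumptions are made: the EAR is treated as a 32-bit value using PowerPC bit numbering (bit 0 is the most significant bit, so bits 28-31 are the four least significant bits), and the returned values are logical bit values, with electrical signal polarity not modeled.

```python
def resource_id_signals(ear):
    """Split EAR bits 28-31 into the logical TBST and TSIZ0-TSIZ2 values."""
    rid = ear & 0xF  # EAR bits 28-31 under big-endian bit numbering
    return {
        "TBST": (rid >> 3) & 1,   # EAR bit 28
        "TSIZ0": (rid >> 2) & 1,  # EAR bit 29
        "TSIZ1": (rid >> 1) & 1,  # EAR bit 30
        "TSIZ2": rid & 1,         # EAR bit 31
    }


print(resource_id_signals(0x8000000A))
```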

7.2.4.3.2 Transfer Burst (TBST) — Input

Following are the state meaning and timing comments for the TBST input signal.

State Meaning Asserted/Negated — Used when snooping for single-beat reads (read
with no intent to cache).

Timing Comments Assertion/Negation — The same as A0-A31.

7.2.4.4 Transfer Code (TC0-TC1) — Output

The transfer code (TC0-TC1) consists of two output signals on the 603e. Following are the
state meaning and timing comments for the TC0-TC1 signals.

State Meaning Asserted/Negated — Represents a special encoding for the transfer in
progress (see Table 7-4).

Timing Comments Assertion/Negation — The same as A0-A31.

High Impedance — The same as A0-A31.



Table 7-4. Encodings for TC0-TC1 Signals

TC(0-1) | Read | Write
0 0 | Data transaction | Any write
0 1 | Touch load | —
1 0 | Instruction fetch | —
1 1 | Reserved | —



7.2.4.5 Cache Inhibit (CI)— Output

The cache inhibit (CI) signal is an output signal on the 603e. Following are the state
meaning and timing comments for the CI signal.

State Meaning Asserted — Indicates that a single-beat transfer will not be cached,
reflecting the setting of the I bit for the block or page that contains
the address of the current transaction.

Negated — Indicates that a burst transfer will allocate a line in the
603e data cache.

Timing Comments Assertion/Negation — The same as A0-A31.

High Impedance — The same as A0-A31.

7.2.4.6 Write-Through (WT)— Output 

The write-through (WT) signal is an output signal on the 603e. Following are the state 
meaning and timing comments for the WT signal. 

State Meaning Asserted — Indicates that a single-beat transaction is write-through,
reflecting the value of the W bit for the block or page that contains
the address of the current transaction.

Negated — Indicates that a transaction is not write-through.

Timing Comments Assertion/Negation — The same as A0-A31.

High Impedance — The same as A0-A31.

7.2.4.7 Global (GBL)

The global (GBL) signal is an input/output signal on the 603e.

7.2.4.7.1 Global (GBL)— Output

Following are the state meaning and timing comments for the GBL output signal.

State Meaning Asserted — Indicates that a transaction is global, reflecting the setting
of the M bit for the block or page that contains the address of the
current transaction (except in the case of copy-back operations and
instruction fetches, which are nonglobal).

Negated — Indicates that a transaction is not global.

Timing Comments Assertion/Negation — The same as A0-A31.

High Impedance — The same as A0-A31.

7.2.4.7.2 Global (GBL)— Input 

Following are the state meaning and timing comments for the GBL input signal. 

State Meaning Asserted — Indicates that a transaction must be snooped by the 603e.

Negated — Indicates that a transaction is not snooped by the 603e.

Timing Comments Assertion/Negation — The same as A0-A31.

7.2.4.8 Cache Set Entry (CSE0-CSE1)— Output 

Following are the state meaning and timing comments for the CSE0-CSE1 signals.

State Meaning Asserted/Negated — Represents the cache replacement set element
for the current transaction reloading into or writing out of the cache.
Can be used with the address bus and the transfer attribute signals to
externally track the state of each cache line in the 603e’s cache. Note
that the CSE0-CSE1 signals are not meaningful during data cache
touch load operations.

Timing Comments Assertion/Negation — The same as A0-A31.

High Impedance — The same as A0-A31.

7.2.5 Address Transfer Termination Signals

The address transfer termination signals are used to indicate either that the address phase 
of the transaction has completed successfully or must be repeated, and when it should be 
terminated. For detailed information about how these signals interact, see Section 8.3.3, 
“Address Transfer Termination.” 



7.2.5.1 Address Acknowledge (AACK)— Input

The address acknowledge (AACK) signal is an input signal (input-only) on the 603e.
Following are the state meaning and timing comments for the AACK signal.

State Meaning Asserted — Indicates that the address phase of a transaction is
complete. The address bus will go to a high impedance state on the
next bus clock cycle. The 603e samples ARTRY on the bus clock
cycle following the assertion of AACK.

Negated — (During ABB) indicates that the address bus and the
transfer attributes must remain driven.

Timing Comments Assertion — May occur as early as the bus clock cycle after TS is
asserted (unless the 603e is configured for 1:1 or 1.5:1 clock modes,
when AACK can be asserted no sooner than the second cycle
following the assertion of TS — one address wait state); assertion can
be delayed to allow adequate address access time for slow devices.
For example, if an implementation supports slow snooping devices,
an external arbiter can postpone the assertion of AACK.

Negation — Must occur one bus clock cycle after the assertion of
AACK.

7.2.5.2 Address Retry (ARTRY)

The address retry (ARTRY) signal is both an input and output signal on the 603e.

7.2.5.2.1 Address Retry (ARTRY) — Output

Following are the state meaning and timing comments for the ARTRY output signal.

State Meaning Asserted — Indicates that the 603e detects a condition in which a
snooped address tenure must be retried. If the 603e needs to update
memory as a result of the snoop that caused the retry, the 603e asserts
BR the second cycle after AACK if ARTRY is asserted.

High Impedance — Indicates that the 603e does not need the snooped
address tenure to be retried.

Timing Comments Assertion — Asserted the third bus cycle following the assertion of
TS if a retry is required.

Negation — Occurs the second bus cycle after the assertion of
AACK. Since this signal may be simultaneously driven by multiple
devices, it negates in a unique fashion. First the buffer goes to high
impedance for a minimum of one-half processor cycle (dependent on
the clock mode), then it is driven negated for one bus cycle before
returning to high impedance.

This special method of negation may be disabled by setting
precharge disable in HID0.

7.2.5.2.2 Address Retry (ARTRY)— Input

Following are the state meaning and timing comments for the ARTRY input signal.

State Meaning Asserted — If the 603e is the address bus master, ARTRY indicates
that the 603e must retry the preceding address tenure and
immediately negate BR (if asserted). If the associated data tenure has
already started, the 603e will also abort the data tenure immediately,
even if the burst data has been received. If the 603e is not the address
bus master, this input indicates that the 603e should immediately
negate BR for one bus clock cycle following the assertion of ARTRY
by the snooping bus master to allow an opportunity for a copy-back
operation to main memory. Note that the subsequent address
presented on the address bus may not be the same one associated
with the assertion of the ARTRY signal.

Negated/High Impedance — Indicates that the 603e does not need to
retry the last address tenure.

Timing Comments Assertion — May occur as early as the second cycle following the
assertion of TS, and must occur by the bus clock cycle immediately
following the assertion of AACK if an address retry is required.

Negation — Must occur during the second cycle after the assertion of
AACK.
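
The timing window for the ARTRY input can be checked with simple cycle arithmetic. The following sketch is an interpretation rather than a specification: the function names are hypothetical, cycles are numbered as integer bus clock cycles, and "the second cycle following TS" is read as the TS cycle plus two.

```python
def artry_assertion_in_window(ts_cycle, aack_cycle, artry_cycle):
    """Check that ARTRY is asserted within the window described above.

    Earliest: the second cycle following the assertion of TS.
    Latest: the bus clock cycle immediately following the assertion of AACK.
    """
    return ts_cycle + 2 <= artry_cycle <= aack_cycle + 1


def artry_negation_cycle(aack_cycle):
    """Return the cycle in which ARTRY must be negated (second after AACK)."""
    return aack_cycle + 2


# TS in cycle 0 and AACK in cycle 3: ARTRY may be asserted in cycles 2 to 4.
print(artry_assertion_in_window(0, 3, 4))  # True
print(artry_assertion_in_window(0, 3, 5))  # False
```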

7.2.6 Data Bus Arbitration Signals 

Like the address bus arbitration signals, data bus arbitration signals maintain an orderly
process for determining data bus mastership. Note that there is no data bus arbitration signal
equivalent to the address bus arbitration signal BR (bus request), because, except for
address-only transactions, TS implies data bus requests. For a detailed description of how
these signals interact, see Section 8.4.1, “Data Bus Arbitration.”

One special signal, DBWO, allows the 603e to be configured dynamically to write data out 
of order with respect to read data. For detailed information about using DBWO, see 
Section 8.10, “Using Data Bus Write Only.” 

7.2.6.1 Data Bus Grant (DBG) — Input 

The data bus grant (DBG) signal is an input signal (input-only) on the 603e. Following are 
the state meaning and timing comments for the DBG signal. 

State Meaning Asserted — Indicates that the 603e may, with the proper qualification,
assume mastership of the data bus. The 603e derives a qualified data
bus grant when DBG is asserted and DBB, DRTRY, and ARTRY are
negated; that is, the data bus is not busy (DBB is negated), there is
no outstanding attempt to retry the current data tenure (DRTRY is
negated), and there is no outstanding attempt to perform an ARTRY
of the associated address tenure.

Negated — Indicates that the 603e must hold off its data tenures.

Timing Comments Assertion — May occur any time to indicate the 603e is free to take
data bus mastership. It is not sampled until TS is asserted.

Negation — May occur at any time to indicate the 603e cannot
assume data bus mastership.
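
The qualified data bus grant condition translates directly into a predicate. This is a minimal sketch under the same assumptions as any simple logical model: the function name is hypothetical and each argument is the logical state of the signal (True means asserted) on the sampled bus clock cycle.

```python
def qualified_data_bus_grant(dbg, dbb, drtry, artry):
    """Return True when DBG is asserted and DBB, DRTRY, and ARTRY are negated."""
    return bool(dbg) and not (dbb or drtry or artry)


# A grant while a data retry is outstanding is not qualified.
print(qualified_data_bus_grant(dbg=True, dbb=False, drtry=True, artry=False))   # False
print(qualified_data_bus_grant(dbg=True, dbb=False, drtry=False, artry=False))  # True
```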

7.2.6.2 Data Bus Write Only (DBWO)— Input 

The data bus write only (DBWO) signal is an input signal (input-only) on the 603e. 
Following are the state meaning and timing comments for the DBWO signal. 

State Meaning Asserted — Indicates that the 603e may run the data bus tenure for an
outstanding write address even if a read address is pipelined before
the write address. Refer to Section 8.10, “Using Data Bus Write
Only,” for detailed instructions for using DBWO.

Negated — Indicates that the 603e must run the data bus tenures in the
same order as the address tenures.

Timing Comments Assertion — Must occur no later than a qualified DBG for an
outstanding write tenure. DBWO is only recognized by the 603e on
the clock of a qualified DBG. If no write requests are pending, the
603e will ignore DBWO and assume data bus ownership for the next
pending read request.

Negation — May occur any time after a qualified DBG and before the
next assertion of DBG.

7.2.6.3 Data Bus Busy (DBB)

The data bus busy (DBB) signal is both an input and output signal on the 603e. 

7.2.6.3.1 Data Bus Busy (DBB)— Output 

Following are the state meaning and timing comments for the DBB output signal. 

State Meaning Asserted — Indicates that the 603e is the data bus master. The 603e
always assumes data bus mastership if it needs the data bus and is
given a qualified data bus grant (see DBG).

Negated — Indicates that the 603e is not using the data bus.

Timing Comments Assertion — Occurs during the bus clock cycle following a qualified
DBG.

Negation — Occurs for a minimum of one-half bus clock cycle
(dependent on clock mode) following the assertion of the final TA.

High Impedance — Occurs after DBB is negated.

7.2.6.3.2 Data Bus Busy (DBB) — Input 

Following are the state meaning and timing comments for the DBB input signal. 

State Meaning Asserted — Indicates that another device is bus master.

Negated — Indicates that the data bus is free (with proper
qualification, see DBG) for use by the 603e.

Timing Comments Assertion — Must occur when the 603e must be prevented from using
the data bus.

Negation — May occur whenever the data bus is available.

7.2.7 Data Transfer Signals 

Like the address transfer signals, the data transfer signals are used to transmit data and to 
generate and monitor parity for the data transfer. For a detailed description of how the data 
transfer signals interact, see Section 8.4.3, “Data Transfer.” 

7.2.7.1 Data Bus (DH0-DH31,DL0-DL31) 

The data bus (DH0-DH31 and DL0-DL31) consists of 64 signals that are both input and 
output on the 603e. Following are the state meaning and timing comments for the DH and 
DL signals. 

State Meaning The data bus has two halves — data bus high (DH) and data bus low
(DL). See Table 7-5 for the data bus lane assignments.

Timing Comments The data bus is driven once for noncached transactions and four
times for cache transactions (bursts).

Table 7-5. Data Bus Lane Assignments

Data Bus Signals | Byte Lane
DH0-DH7 | 0
DH8-DH15 | 1
DH16-DH23 | 2
DH24-DH31 | 3
DL0-DL7 | 4
DL8-DL15 | 5
DL16-DL23 | 6
DL24-DL31 | 7


7.2.7.1.1 Data Bus (DH0-DH31, DL0-DL31)— Output

Following are the state meaning and timing comments for the DH and DL output signals.

State Meaning Asserted/Negated — Represents the state of data during a data write.
Byte lanes not selected for data transfer will not supply valid data.

Timing Comments Assertion/Negation — Initial beat coincides with DBB and, for
bursts, transitions on the bus clock cycle following each assertion of
TA.

High Impedance — Occurs on the bus clock cycle after the final
assertion of TA.

7.2.7.1.2 Data Bus (DH0-DH31, DL0-DL31)— Input

Following are the state meaning and timing comments for the DH and DL input signals.

State Meaning Asserted/Negated — Represents the state of data during a data read
transaction.

Timing Comments Assertion/Negation — Data must be valid on the same bus clock
cycle that TA is asserted.

7.2.7.2 Data Bus Parity (DP0-DP7) 

The eight data bus parity (DP0-DP7) signals on the 603e are both output and input signals. 

7.2.7.2.1 Data Bus Parity (DP0-DP7)— Output 

Following are the state meaning and timing comments for the DP output signals. 

State Meaning Asserted/Negated — Represents odd parity for each of 8 bytes of data
write transactions. Odd parity means that an odd number of bits,
including the parity bit, are driven high. The signal assignments are
listed in Table 7-6.

Timing Comments Assertion/Negation — The same as DL0-DL31.

High Impedance — The same as DL0-DL31.

Table 7-6. DP0-DP7 Signal Assignments

Signal Name | Signal Assignments
DP0 | DH0-DH7
DP1 | DH8-DH15
DP2 | DH16-DH23
DP3 | DH24-DH31
DP4 | DL0-DL7
DP5 | DL8-DL15
DP6 | DL16-DL23
DP7 | DL24-DL31



7.2.7.2.2 Data Bus Parity (DP0-DP7)— Input 

Following are the state meaning and timing comments for the DP input signals. 

State Meaning Asserted/Negated — Represents odd parity for each byte of read data. 

Parity is checked on all data byte lanes, regardless of the size of the 
transfer. Detected even parity causes a check stop if data parity errors 
are enabled in the HIDO register. (See DPE.) 

Timing Comments Assertion/Negation — ^The same as DL0-DL3 1 . 
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7.2.7.3 Data Parity Error (DPE)— Output 

The data parity error (DPE) signal is an output signal (output-only) on the 603e. Note that 
the DPE signal is an open-drain type output, and requires an external pull-up resistor (for 
example, 10 kΩ to Vdd) to assure proper de-assertion of the DPE signal. Following are 
the state meaning and timing comments for the DPE signal. 

State Meaning Asserted — Indicates incorrect data bus parity. 

Negated — Indicates correct data bus parity. 

Timing Comments Assertion — Occurs on the second bus clock cycle after TA is 
asserted to the 603e, unless TA is cancelled by an assertion of 
DRTRY. 

High Impedance — Occurs on the third bus clock cycle after TA is 
asserted to the 603e. 



7.2.7.4 Data Bus Disable (DBDIS)— Input 

The Data Bus Disable (DBDIS) signal is an input signal (input-only) on the 603e. 
Following are the state meanings and timing comments for the DBDIS signal. 

State Meaning Asserted — Indicates (for a write transaction) that the 603e must 

release the data bus and the data bus parity to high impedance during the 
following cycle. The data tenure will remain active, DBB will 
remain driven, and the transfer termination signals will still be 
monitored by the 603e. 

Negated — Indicates the data bus should remain normally driven. 
DBDIS is ignored during read transactions. 

Timing Comments Assertion/Negation — May be asserted on any clock cycle when the 
603e is driving, or will be driving, the data bus; may remain asserted 
for multiple cycles. 

7.2.8 Data Transfer Termination Signals 

Data termination signals are required after each data beat in a data transfer. Note that in a 
single-beat transaction, the data termination signals also indicate the end of the tenure, 
while in burst accesses, the data termination signals apply to individual beats and indicate 
the end of the tenure only after the final data beat. 

For a detailed description of how these signals interact, see Section 8.4.4, “Data Transfer 
Termination.” 

7.2.8.1 Transfer Acknowledge (TA) — Input 

The transfer acknowledge (TA) signal is an input signal (input-only) on the 603e. 
Following are the state meaning and timing comments for the TA signal. 

State Meaning Asserted — Indicates that a single-beat data transfer completed 

successfully or that a data beat in a burst transfer completed 
successfully (unless DRTRY is asserted on the next bus clock cycle). 
Note that TA must be asserted for each data beat in a burst 
transaction, and must be asserted during assertion of DRTRY. For 
more information, see Section 8.4.4, “Data Transfer Termination.” 

Negated — (During DBB) indicates that, until TA is asserted, the 
603e must continue to drive the data for the current write or must 
wait to sample the data for reads. 

Timing Comments Assertion — Must not occur before AACK for the current transaction 
(if the address retry mechanism is to be used to prevent invalid data 
from being used by the processor); otherwise, assertion may occur at 
any time during the assertion of DBB. The system can withhold 
assertion of TA to indicate that the 603e should insert wait states to 
extend the duration of the data beat. 

Negation — Must occur after the bus clock cycle of the final (or only) 
data beat of the transfer. For a burst transfer, the system can assert TA 
for one bus clock cycle and then negate it to advance the burst 
transfer to the next beat and insert wait states during the next beat. 
(Note: when the 603e is configured for 1:1 clock mode and is 
performing a burst read into the data cache, the 603e requires one 
wait state between the assertion of TS and the first assertion of TA 
for that transaction. If no-DRTRY mode is also selected, the 603e 
requires two wait states for 1:1 clock mode, or 1 wait state for 1.5:1 
clock mode.) 

7.2.8.2 Data Retry (DRTRY)— Input 

The data retry (DRTRY) signal is input only on the 603e. Following are the state meaning 
and timing comments for the DRTRY signal. 

State Meaning Asserted — Indicates that the 603e must invalidate the data from the 

previous read operation. 

Negated — Indicates that data presented with TA on the previous read 
operation is valid. Note that DRTRY is ignored for write 
transactions. 

Timing Comments Assertion — Must occur during the bus clock cycle immediately after 
TA is asserted if a retry is required. The DRTRY signal may be held 
asserted for multiple bus clock cycles. When DRTRY is negated, 
data must have been valid on the previous clock with TA asserted. 

Negation — Must occur during the bus clock cycle after a valid data 
beat. This may occur several cycles after DBB is negated, effectively 
extending the data bus tenure. 

Start-up — The DRTRY signal is sampled at the negation of 
HRESET; if DRTRY is asserted, no-DRTRY mode is selected. If 
DRTRY is negated at start-up, DRTRY is enabled. 
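
The interaction of TA and DRTRY on read data can be sketched as a small per-clock model. This is a simplified illustration, not a description of the 603e logic: the function name and the tuple layout are hypothetical, signals are represented as True when asserted, and the model assumes DRTRY mode is enabled and that the transaction is a read (DRTRY is ignored for writes).

```python
def accepted_read_beats(cycles):
    """Return the data values that survive DRTRY.

    cycles: list of (ta, drtry, data) tuples, one per bus clock cycle.
    A beat presented with TA is kept only if DRTRY is negated on the
    following bus clock cycle.
    """
    kept = []
    for i, (ta, _drtry, data) in enumerate(cycles):
        if not ta:
            continue  # wait state: no data beat on this clock
        # DRTRY asserted on the next clock invalidates the beat just presented.
        retried = i + 1 < len(cycles) and cycles[i + 1][1]
        if not retried:
            kept.append(data)
    return kept
```

For example, if a beat is presented with TA and DRTRY is asserted on the next clock together with TA and replacement data, only the replacement data is kept once DRTRY is negated.
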

7.2.8.3 Transfer Error Acknowledge (TEA) — Input 

The transfer error acknowledge (TEA) signal is input only on the 603e. Following are the 
state meaning and timing comments for the TEA signal. 

State Meaning Asserted — Indicates that a bus error occurred. Causes a machine 

check exception (and possibly causes the processor to enter 
checkstop state if the machine check enable bit is cleared 
(MSR[ME] = 0)). For more information, see Section 4.5.2.2, 
“Checkstop State (MSR[ME] = 0).” Assertion terminates the current 
transaction; that is, assertions of TA and DRTRY are ignored. The 
assertion of TEA causes the negation/high impedance of DBB in the 
next clock cycle. However, data entering the GPR or the cache are 
not invalidated. (Note that the term, ‘exception,’ is also referred to as 
‘interrupt’ in the architecture specification.) 

Negated — Indicates that no bus error was detected. 

Timing Comments Assertion — May be asserted while DBB is asserted, and the cycle 
after TA during a read operation. TEA should be asserted for one 
cycle only. 

Negation — TEA must be negated no later than the negation of DBB. 

7.2.9 System Status Signals 

Most system status signals are input signals that indicate when exceptions are received, 
when checkstop conditions have occurred, and when the 603e must be reset. The 603e 
generates the output signal, CKSTP_OUT, when it detects a checkstop condition. For a 
detailed description of these signals, see Section 8.7, “Interrupt, Checkstop, and Reset 
Signals.” 

7.2.9.1 Interrupt (INT)— Input 

The interrupt (INT) signal is input only. Following are the state meaning and timing 
comments for the INT signal. 

State Meaning Asserted — The 603e initiates an interrupt if MSR[EE] is set; 

otherwise, the 603e ignores the interrupt. To guarantee that the 603e 
will take the external interrupt, the INT signal must be held active 
until the 603e takes the interrupt; otherwise, whether the 603e takes 
an external interrupt depends on whether the MSR[EE] bit was set 
while the INT signal was held active. 

Negated — Indicates that normal operation should proceed. See 
Section 8.7.1, “External Interrupts.” 

Timing Comments Assertion — May occur at any time and may be asserted 

asynchronously to the input clocks. The INT input is level-sensitive. 

Negation — Should not occur until the interrupt is taken. 

7.2.9.2 System Management Interrupt (SMI) — Input 

The system management interrupt (SMI) signal is input only. Following are the state 
meaning and timing comments for the SMI signal. 

State Meaning Asserted — The 603e initiates a system management interrupt 

operation if MSR[EE] is set; otherwise, the 603e ignores the 
exception condition. The system must hold the SMI signal active until 
the exception is taken. 

Negated — Indicates that normal operation should proceed. See 
Section 8.7.1, “External Interrupts.” 

Timing Comments Assertion — May occur at any time and may be asserted 

asynchronously to the input clocks. The SMI input is level-sensitive. 

Negation — Should not occur until the interrupt is taken. 

7.2.9.3 Machine Check Interrupt (MCP)— Input 

The machine check interrupt (MCP) signal is input only on the 603e. Following are the state 
meaning and timing comments for the MCP signal. 

State Meaning Asserted — The 603e initiates a machine check interrupt operation if 

MSR[ME] and HID0[EMCP] are set; if MSR[ME] is cleared and 
HID0[EMCP] is set, the 603e must terminate operation by internally 
gating off all clocks, and releasing all outputs (except CKSTP_OUT) 
to the high impedance state. If HID0[EMCP] is cleared, the 603e 
ignores the interrupt condition. The MCP signal must be held 
asserted for 2 bus clock cycles. 

Negated — Indicates that normal operation should proceed. See 
Section 8.7.1, “External Interrupts.” 

Timing Comments Assertion — May occur at any time and may be asserted 

asynchronously to the input clocks. The MCP input is negative 
edge-sensitive. 

Negation — May be negated 2 bus cycles after assertion. 
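
The three outcomes described for an MCP assertion can be summarized as a decision function. This is a minimal sketch of the stated conditions only; the function name and the returned strings are hypothetical labels.

```python
def mcp_response(msr_me, hid0_emcp):
    """Return the 603e response to an MCP assertion.

    msr_me:    True if MSR[ME] is set.
    hid0_emcp: True if HID0[EMCP] is set.
    """
    if not hid0_emcp:
        return "ignored"  # HID0[EMCP] cleared: the condition is ignored
    if msr_me:
        return "machine check interrupt"
    # MSR[ME] cleared with HID0[EMCP] set: clocks gated off and outputs
    # (except CKSTP_OUT) released to the high impedance state.
    return "checkstop"
```
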

7.2.9.4 Checkstop Input (CKSTP_IN) — Input 

The checkstop input (CKSTP_IN) signal is input only on the 603e. Following are the state 
meaning and timing comments for the CKSTP_IN signal. 

State Meaning Asserted — Indicates that the 603e must terminate operation by 

internally gating off all clocks, and release all outputs (except 
CKSTP_OUT) to the high impedance state. Once CKSTP_IN has 
been asserted it must remain asserted until the system has been reset. 

Negated — Indicates that normal operation should proceed. See 
Section 8.7.2, “Checkstops.” 







Timing Comments Assertion — May occur at any time and may be asserted 
asynchronously to the input clocks. 

Negation — May occur any time after the CKSTP_OUT output signal 
has been asserted. 

7.2.9.5 Checkstop Output (CKSTP_OUT)— Output 

The checkstop output (CKSTP_OUT) signal is output only on the 603e. Note that the 
CKSTP_OUT signal is an open-drain type output, and requires an external pull-up resistor 
(for example, 10 kΩ to Vdd) to assure proper de-assertion of the CKSTP_OUT signal. 
Following are the state meaning and timing comments for the CKSTP_OUT signal. 

State Meaning Asserted — Indicates that the 603e has detected a checkstop 

condition and has ceased operation. 

Negated — Indicates that the 603e is operating normally. 

See Section 8.7.2, “Checkstops.” 

Timing Comments Assertion — May occur at any time and may be asserted 
asynchronously to the 603e input clocks. 

Negation — Is negated upon assertion of HRESET. 

7.2.9.6 Reset Signals 

There are two reset signals on the 603e — hard reset (HRESET) and soft reset (SRESET). 
Descriptions of the reset signals are as follows: 

7.2.9.6.1 Hard Reset (HRESET)— Input 

The hard reset (HRESET) signal is input only and must be used at power-on to properly 
reset the processor. Following are the state meaning and timing comments for the HRESET 
signal. 

State Meaning Asserted — Initiates a complete hard reset operation when this input 

transitions from asserted to negated. Causes a reset exception as 
described in Section 4.5.1.1, “Hard Reset and Power-On Reset.” 
Output drivers are released to high impedance within five clocks 
after the assertion of HRESET. 

Negated — Indicates that normal operation should proceed. See 
Section 8.7.3, “Reset Inputs.” 

Timing Comments Assertion — May occur at any time and may be asserted 

asynchronously to the 603e input clock; must be held asserted for a 
minimum of 255 clock cycles after the PLL lock time has been met. 
Refer to the PowerPC 603e RISC Microprocessor Hardware 
Specifications for further timing comments. 

Negation — May occur any time after the minimum reset pulse width 
has been met. 

This input has additional functionality in certain test modes. 

7.2.9.6.2 Soft Reset (SRESET)— Input 

The soft reset (SRESET) signal is input only. Following are the state meaning and timing 
comments for the SRESET signal. 

State Meaning Asserted — Initiates processing for a reset exception as described in 

Section 4.5.1.2, “Soft Reset.” 

Negated — Indicates that normal operation should proceed. See 
Section 8.7.3, “Reset Inputs.” 

Timing Comments Assertion — May occur at any time and may be asserted 

asynchronously to the 603e input clock. The SRESET input is 
negative edge-sensitive. 

Negation — May be negated 2 bus cycles after assertion. 

This input has additional functionality in certain test modes. 

7.2.9.7 Processor Status Signals 

Processor status signals indicate the state of the processor. This includes the memory 
reservation signal, machine quiesce control signals, time base enable signal, and TLBI 
Sync signal. 

7.2.9.7.1 Quiescent Request (QREQ) 

The quiescent request (QREQ) signal is output only. Following are the state meaning and 
timing comments for the QREQ signal. 

State Meaning Asserted — Indicates that the 603e is requesting all bus activity 

normally required to be snooped to terminate or to pause so the 603e 
may enter a quiescent (low power) state. Once the 603e has entered 
a quiescent state, it no longer snoops bus activity. 

Negated — Indicates that the 603e is not making a request to enter the 
quiescent state. 

Timing Comments Assertion/Negation — May occur on any cycle. QREQ will remain 
asserted for the duration of the quiescent state. 

7.2.9.7.2 Quiescent Acknowledge (QACK) 

The quiescent acknowledge (QACK) signal is input only. Following are the state meaning 
and timing comments for the QACK signal. 

State Meaning Asserted — Indicates that all bus activity that requires snooping has 

terminated or paused, and that the 603e may enter the quiescent (or 
low power) state. 

Negated — Indicates that the 603e may not enter a quiescent state, 
and must continue snooping the bus. 

Timing Comments Assertion/Negation — May occur on any cycle following the 

assertion of QREQ, and must be held asserted for a minimum of one 
bus clock cycle. 

Start-Up — QACK is sampled at the negation of HRESET to select 
reduced-pinout mode; if QACK is asserted at start-up, reduced-pinout 
mode is disabled. 

7.2.9.7.3 Reservation (RSRV)— Output 

The reservation (RSRV) signal is output only on the 603e. Following are the state meaning 
and timing comments for the RSRV signal. 

State Meaning Asserted/Negated — Represents the state of the reservation 

coherency bit in the reservation address register that is used by the 
lwarx and stwcx. instructions. See Section 8.8.1, “Support for the 
lwarx/stwcx. Instruction Pair.” 

Timing Comments Assertion/Negation — Occurs synchronously with respect to bus 

clock cycles. The execution of an lwarx instruction sets the internal 
reservation condition. 

7.2.9.7.4 Timebase Enable (TBEN) — Input 

The timebase enable (TBEN) signal is input only on the 603e. Following are the state 
meanings and timing comments for the TBEN signal. 

State Meaning Asserted — Indicates that the timebase should continue clocking. 

This input is essentially a “count enable” control for the timebase 
counter. 

Negated — Indicates the timebase should stop clocking. 

Timing Comments Assertion/Negation — May occur on any cycle. 

7.2.9.7.5 TLBI Sync (TLBISYNC) 

The TLBI Sync (TLBISYNC) signal is input only on the 603e. Following are the state 
meanings and timing comments for the TLBISYNC signal. 

State Meaning Asserted — Indicates that instruction execution should stop after 

execution of a tlbsync instruction. 

Negated — Indicates that the instruction execution may continue or 
resume after the completion of a tlbsync instruction. 

Timing Comments Assertion/Negation — May occur on any cycle. 

Start-Up — TLBISYNC is sampled at the negation of HRESET to 
select 32-bit data bus mode; if TLBISYNC is negated at start-up, 32-bit 
mode is disabled and the default 64-bit mode is selected. 
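
Three inputs described in this chapter (DRTRY, QACK, and TLBISYNC) are sampled at the negation of HRESET to select operating modes. The sketch below collects those start-up rules in one place. It is illustrative only: the function and key names are hypothetical, each argument is True when the signal is asserted at the sample point, and the cases the text does not state outright (QACK negated, TLBISYNC asserted) are inferred as the complements of the stated ones.

```python
def reset_configuration(drtry_asserted, qack_asserted, tlbisync_asserted):
    """Return the modes selected by the pins sampled at HRESET negation."""
    return {
        # DRTRY asserted at start-up selects no-DRTRY mode.
        "no_drtry_mode": drtry_asserted,
        # QACK asserted at start-up disables reduced-pinout mode.
        "reduced_pinout_mode": not qack_asserted,
        # TLBISYNC negated at start-up selects the default 64-bit data bus.
        "data_bus_bits": 32 if tlbisync_asserted else 64,
    }
```
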

7.2.10 COP/Scan Interface 

The 603e has extensive on-chip test capability including the following: 

• Built-in instruction and data cache self test (BIST) 

• Debug control/observation (COP) 

• Boundary scan (IEEE 1149.1 compliant interface) 

• LSSD test control 

The BIST hardware is not exercised as part of the POR sequence. The COP and boundary 
scan logic are not used under typical operating conditions. 

Detailed discussion of the 603e test functions is beyond the scope of this document; 
however, sufficient information has been provided to allow the system designer to disable 
the test functions that would impede normal operation. 

The COP/scan interface is shown in Figure 7-2. For more information, see Section 8.9, 
“IEEE 1149.1-Compliant Interface.” 



TDI (Test Data Input) 
TMS (Test Mode Select) 
TCK (Test Clock input) 
TDO (Test Data Output) 
TRST (Test Reset) 



Figure 7-2. IEEE 1149.1-Compliant Boundary Scan Interface 

7.2.11 Pipeline Tracking Support 

The 603e provides for nonintrusive instruction pipeline tracking. Setting the HID0[EICE] 
bit causes the address parity and data parity signals to be redefined as outputs providing 
pipeline tracking information. These signals toggle at the CPU clock rate and will have 
special loading and timing requirements when in this mode. Table 7-7 shows the outputs 
when HID0[EICE] is set. 

Table 7-7. Pipeline Tracking Outputs 

Bit(s)     Function      Encoding
DP[0-1]    Fetch         00   None
                         01   Two
                         10   One
                         11   Branch
DP[2-3]    Retire        00   None
                         01   Two
                         10   One
                         11   Exception
DP[4-5]    Fold          00   None
                         01   First
                         10   Second
                         11   Both
DP[6-7]    Prediction    00   Nonspec
                         01   Spec_2nd
                         10   Spec_both
                         11   Flush_spec
AP[0-3]    FEA           FEA[20-23]

Given the object code, these signals provide sufficient information to track instruction 
execution (except for register indirect branches). Register indirect branches may be tracked 
either by examining and matching potential target streams (nonintrusive but not always 
resolvable), or by forcing register indirect branch targets to be fetched externally by setting 
HID0[FBIOB]. 

Setting HID0[EICE] also enables the processor clock to the CLK_OUT signal which 
provides a synchronizing clock to the pipeline tracking outputs. 
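
The encodings in Table 7-7 lend themselves to a small decoder for captured samples. The following is a sketch under stated assumptions: the function name is hypothetical, the DP value is assumed to be packed with DP0 as the most significant bit of an 8-bit sample, and the AP value with AP0 as the most significant bit of a 4-bit sample.

```python
FETCH = ("None", "Two", "One", "Branch")
RETIRE = ("None", "Two", "One", "Exception")
FOLD = ("None", "First", "Second", "Both")
PREDICTION = ("Nonspec", "Spec_2nd", "Spec_both", "Flush_spec")


def decode_tracking(dp, ap):
    """Decode one sample of the pipeline tracking outputs.

    dp: 8-bit value, DP0 assumed to be the most significant bit.
    ap: 4-bit value carrying FEA[20-23].
    """
    return {
        "fetch": FETCH[(dp >> 6) & 0b11],            # DP[0-1]
        "retire": RETIRE[(dp >> 4) & 0b11],          # DP[2-3]
        "fold": FOLD[(dp >> 2) & 0b11],              # DP[4-5]
        "prediction": PREDICTION[dp & 0b11],         # DP[6-7]
        "fea_20_23": ap & 0xF,                       # AP[0-3]
    }
```
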

7.2.12 Clock Signals 

The clock signal inputs of the 603e determine the system clock frequency and provide a 
flexible clocking scheme that allows the processor to operate at an integer or half-integer 
multiple of the system clock frequency. 

Refer to the PowerPC 603e RISC Microprocessor Hardware Specifications for exact 
timing relationships of the clock signals. 

7.2.12.1 System Clock (SYSCLK)— Input 

The 603e requires a single system clock (SYSCLK) input. This input sets the frequency of 
operation for the bus interface. Internally, the 603e uses a phase-lock loop (PLL) circuit to 
generate a master clock for all of the CPU circuitry (including the bus interface circuitry) 
which is phase-locked to the SYSCLK input. The master clock may be set to an integer or 
half-integer multiple (1:1, 1.5:1, 2:1, 2.5:1, 3:1, 3.5:1 or 4:1) of the SYSCLK frequency 
allowing the CPU core to operate at an equal or greater frequency than the bus interface. 


























State Meaning Asserted/Negated — The SYSCLK input is the primary clock input 

for the 603e, and represents the bus clock frequency for 603e bus 
operation. Internally, the 603e may be operating at an integer or half- 
integer multiple of the bus clock frequency. 

Timing Comments Duty cycle — Refer to the PowerPC 603e RISC Microprocessor 
Hardware Specifications for timing comments. 

Note: SYSCLK is used as the frequency reference for the internal 
PLL clock generator, and must not be suspended or varied during 
normal operation to ensure proper PLL operation. 

7.2.12.2 Test Clock (CLK_OUT)— Output 

The test clock (CLK_OUT) signal is an output signal (output-only) on the 603e. Following 
are the state meaning and timing comments for the CLK_OUT signal. 

State Meaning Asserted/Negated — Provides PLL clock output for PLL testing and 

monitoring. The CLK_OUT signal clocks at either the processor 
clock frequency, the bus clock frequency, or the half-bus clock 
frequency if enabled by the appropriate bits in the HID0 register; the 
default state of the CLK_OUT signal is high-impedance. The 
CLK_OUT signal is provided for testing purposes only. 

Timing Comments Assertion/Negation — Refer to the PowerPC 603e RISC 

Microprocessor Hardware Specifications for timing comments. 

7.2.12.3 PLL Configuration (PLL_CFG0-PLL_CFG3)— Input 

The PLL (phase-lock loop) is configured by the PLL_CFG0-PLL_CFG3 signals. For a 
given SYSCLK (bus) frequency, the PLL configuration signals set the internal CPU 
frequency of operation. 

Following are the state meaning and timing comments for the PLL_CFG0-PLL_CFG3 
signals. 

State Meaning Asserted/Negated — Configures the operation of the PLL and the 

internal processor clock frequency. Settings are based on the desired 
bus and internal frequency of operation. 

Timing Comments Assertion/Negation — Must remain stable during operation; should 
only be changed during the assertion of HRESET or during sleep 
mode. These bits may be read through bits PC0-PC3 in the HID1 
register. 

Table 7-8. PLL Configuration 

                   Bus, CPU and PLL Frequencies
PLL_CFG0  CPU/
to        SYSCLK   Bus         Bus         Bus         Bus         Bus         Bus         Bus
PLL_CFG3  Ratio    16.6 MHz    20 MHz      25 MHz      33.3 MHz    40 MHz      50 MHz      66.6 MHz
0000      1:1      -           -           -           -           -           -           66.6 (133)
0001      1:1      -           -           -           33.3 (133)  40 (160)    50 (200)    -
0010      1:1      16.6 (133)  20 (160)    25 (200)    -           -           -           -
1100      1.5:1    -           -           -           -           -           75 (150)    100 (200)
0100      2:1      -           -           -           66.6 (133)  80 (160)    100 (200)   -
0101      2:1      33.3 (133)  40 (160)    50 (200)    -           -           -           -
0110      2.5:1    -           -           -           83.3 (166)  100 (200)   -           -
1000      3:1      -           -           75 (150)    100 (200)   -           -           -
1110      3.5:1    -           70 (140)    87.5 (175)  -           -           -           -
1010      4:1      66.6 (133)  80 (160)    100 (200)   -           -           -           -
0011      PLL Bypass
1111      Clock Off

Notes: 



1. Some PLL configurations may select bus, CPU, or PLL frequencies which are not useful, not 
supported, or not tested for by the 603e. For complete information, see the 603e hardware 
specifications for timing comments. PLL frequencies (shown in parentheses in the table above) 
should not fall below 133 MHz, and should not exceed 200 MHz. 

2. In PLL-bypass mode, the SYSCLK input signal clocks the internal processor directly, and the bus is 
set for 1:1 mode operation. In clock-off mode, no clocking occurs inside the 603e regardless of the 
SYSCLK input. 
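
The relationship between the PLL_CFG setting, the bus frequency, and the resulting CPU and PLL frequencies in Table 7-8 can be checked with a short calculation. The sketch below is an illustration, not a definitive model: the names are hypothetical, the PLL/CPU multiplier for each setting is inferred from the values printed in the table rather than stated in the text, and a small tolerance is assumed because the table lists rounded nominal frequencies (for example, 16.6 MHz).

```python
# cfg: (CPU/SYSCLK ratio, PLL/CPU multiplier inferred from Table 7-8)
PLL_CFG = {
    0b0000: (1.0, 2), 0b0001: (1.0, 4), 0b0010: (1.0, 8),
    0b1100: (1.5, 2), 0b0100: (2.0, 2), 0b0101: (2.0, 4),
    0b0110: (2.5, 2), 0b1000: (3.0, 2), 0b1110: (3.5, 2),
    0b1010: (4.0, 2),
}
PLL_MIN_MHZ, PLL_MAX_MHZ = 133.0, 200.0  # limits from Note 1


def pll_frequencies(cfg, bus_mhz, tolerance_mhz=0.5):
    """Return (cpu_mhz, pll_mhz, pll_in_range) for a ratio-selecting setting."""
    if cfg not in PLL_CFG:
        # 0011 (PLL bypass), 1111 (clock off), and unlisted codes select no ratio.
        raise ValueError("PLL_CFG value does not select a CPU/SYSCLK ratio")
    ratio, pll_multiplier = PLL_CFG[cfg]
    cpu_mhz = bus_mhz * ratio
    pll_mhz = cpu_mhz * pll_multiplier
    in_range = PLL_MIN_MHZ - tolerance_mhz <= pll_mhz <= PLL_MAX_MHZ + tolerance_mhz
    return cpu_mhz, pll_mhz, in_range
```

With a 50-MHz bus, setting 1100 gives a 75-MHz CPU clock and a 150-MHz PLL frequency, which matches the table entry; setting 0000 at the same bus frequency would place the PLL at 100 MHz, below the 133-MHz limit, which is why the table leaves that cell empty.
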

7.2.13 Power and Ground Signals 

The 603e provides the following connections for power and ground: 

• VDD and OVDD — The VDD and OVDD signals provide the connection for the 
supply voltage. On the 603e, there is no electrical distinction between the VDD and 
the OVDD signals. 

• AVDD — The AVDD power signal provides power to the clock generation phase- 
lock loop. See the PowerPC 603e RISC Microprocessor Hardware Specifications 
for information on how to use this signal. 

• GND and OGND — The GND and OGND signals provide the connection for 
grounding the 603e. On the 603e, there is no electrical distinction between the GND 
and OGND signals. 

Chapter 8 

System Interface Operation 

This chapter describes the PowerPC 603e microprocessor bus interface and its operation. 
It shows how the 603e signals, defined in Chapter 7, “Signal Descriptions,” interact to 
perform address and data transfers. 

8.1 PowerPC 603e Microprocessor System Interface 
Overview 

The system interface prioritizes requests for bus operations from the instruction and data 
caches, and performs bus operations per the 603e bus protocol. It includes address register 
queues, prioritization logic, and bus control unit. The system interface latches snoop 
addresses for snooping in the data cache and in the address register queues, snoops for 
direct-store reply operations and for reservations controlled by the Load Word and Reserve 
Indexed (lwarx) and Store Word Conditional Indexed (stwcx.) instructions, and maintains 
the touch load address for the cache. The interface allows one level of pipelining; that is, 
with certain restrictions discussed later, there can be two outstanding transactions at any 
given time. Accesses are prioritized with load operations preceding store operations. 

Instructions are automatically fetched from the memory system into the instruction unit 
where they are dispatched to the execution units at a peak rate of three instructions per 
clock. Conversely, load and store instructions explicitly specify the movement of operands 
to and from the integer and floating-point register files and the memory system. 

When the 603e encounters an instruction or data access, it calculates the logical address 
(effective address in the architecture specification) and uses the low-order address bits to 
check for a hit in the on-chip, 16-Kbyte instruction and data caches. During cache lookup, 
the instruction and data memory management units (MMUs) use the higher-order address 
bits to calculate the virtual address, from which they calculate the physical address (real 
address in the architecture specification). The physical address bits are then compared with 
the corresponding cache tag bits to determine if a cache hit occurred. If the access misses 
in the corresponding cache, the physical address is used to access system memory. 

In addition to the loads, stores, and instruction fetches, the 603e performs software table 
search operations following TLB misses, cache cast-out operations when least-recently 
used cache lines are written to memory after a cache miss, and cache-line snoop push-out 
operations when a modified cache line experiences a snoop hit from another bus master. 

Figure 8-1 shows the address path from the execution units and instruction fetcher, through 
the translation logic to the caches and system interface logic. 

The 603e uses separate address and data buses and a variety of control and status signals 
for performing reads and writes. The address bus is 32 bits wide and the data bus can be 
configured to be 32 or 64 bits wide. The interface is synchronous — all 603e inputs are 
sampled at and all outputs are driven from the rising edge of the bus clock. The bus can run 
at the full processor-clock frequency or at an integer division of the processor-clock speed. 
While the 603e operates at 3.3 volts, all the I/O signals are 5.0 volt TTL-compatible. 

8.1.1 Operation of the Instruction and Data Caches 

The 603e provides independent instruction and data caches. Each cache is a physically- 
addressed, 16-Kbyte cache with four-way set associativity. Both caches consist of 128 sets 
of four cache lines, with eight words in each cache line. 
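
The cache geometry given above (128 sets, four lines per set, eight 4-byte words per line) implies how a 32-bit physical address divides into fields. The sketch below shows that arithmetic. It assumes the conventional set-associative decomposition, with the byte offset in the lowest bits and the set index directly above it; the function name is hypothetical.

```python
LINE_BYTES = 8 * 4   # eight words of four bytes each
SETS = 128
WAYS = 4
CACHE_BYTES = SETS * WAYS * LINE_BYTES  # 16 Kbytes


def split_address(physical_address):
    """Return (tag, set_index, byte_offset) for a 32-bit physical address."""
    address = physical_address & 0xFFFFFFFF
    byte_offset = address & (LINE_BYTES - 1)   # 5 bits: byte within the line
    set_index = (address >> 5) & (SETS - 1)    # 7 bits: one of 128 sets
    tag = address >> 12                        # remaining 20 bits
    return tag, set_index, byte_offset
```

The byte offset and set index together occupy the low-order 12 bits of the address, so under this assumption the set can be selected from the low-order address bits while the MMU translates the higher-order bits.
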

Because the data cache on the 603e is an on-chip, write-back primary cache, the 
predominant type of transaction for most applications is burst-read memory operations, 
followed by burst-write memory operations, direct-store operations, and single-beat 
(noncacheable or write-through) memory read and write operations. Additionally, there can 
be address-only operations, variants of the burst and single-beat operations (global memory 
operations that are snooped, and atomic memory operations, for example), and address 
retry activity (for example, when a snooped read access hits a modified line in the cache). 

Since the 603e data cache tags are single ported, simultaneous load or store and snoop 
accesses cause resource contention. Snoop accesses have the highest priority and are given 
first access to the tags, unless the snoop access coincides with a tag write, in which case the 
snoop is retried and must re-arbitrate for access to the cache. Loads or stores that are 
deferred due to snoop accesses are performed on the clock cycle following the snoop. 

The 603e supports a three-state coherency protocol that supports the modified, exclusive, 
and invalid (MEI) cache states. The protocol is a subset of the MESI 
(modified/exclusive/shared/invalid) four-state protocol and operates coherently in systems 
that contain four-state caches. With the exception of the dcbz instruction, the 603e does not 
broadcast cache control instructions. The cache control instructions are intended for the 
management of the local cache but not for other caches in the system. 

Cache lines in the 603e are loaded in four beats of 64 bits each (or eight beats of 32 bits 
each when operating in 32-bit bus mode). The burst load is performed as “critical double 
word first.” The cache that is being loaded is blocked to internal accesses until the load 
completes (that is, no hits under misses). The critical double word is simultaneously written 
to the cache and forwarded to the requesting unit, thus minimizing stalls due to load delays. 
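
The "critical double word first" behavior can be illustrated for the 64-bit bus mode. This sketch assumes that the remaining double words follow in ascending order and wrap around within the 32-byte line; the text above states only that the critical double word comes first, so the wrap-around sequence and the function name are assumptions.

```python
def burst_order(address):
    """Return the order in which the four double words of a cache line
    are transferred, starting with the one containing the address."""
    critical = (address >> 3) & 0b11  # double word (8 bytes) within the 32-byte line
    return [(critical + beat) % 4 for beat in range(4)]
```

For an access to byte offset 0x10 within a line, the sketch yields the order 2, 3, 0, 1, so the requested data arrives on the first beat.
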

[Block diagram; external connections shown: 32-bit address bus, 32-/64-bit data bus]

Figure 8-1. PowerPC 603e Microprocessor Block Diagram 

Cache lines are selected for replacement based on an LRU (least recently used) algorithm. 
Each time a cache line is accessed, it is tagged as the most recently used line of the set. 
When a miss occurs, if all four lines in the set are marked as valid, the least recently used line 
is replaced with the new data. When data to be replaced is in the modified state, the 
modified data is written into a write-back buffer while the missed data is being read from 
memory. When the load completes, the 603e then pushes the replaced line from the write- 
back buffer to main memory in a burst write operation. 

8.1.2 Operation of the System Interface 

Memory accesses can occur in single-beat (1-8 bytes) and four-beat (32 bytes) burst data 
transfers when the 603e is configured with a 64-bit data bus. When the 603e is in the 
optional 32-bit data bus mode, memory accesses can occur in single-beat (1 to 4 bytes), 
two-beat (8 bytes), and eight-beat (32 bytes) bursts. The address and data buses are 
independent for memory accesses to support pipelining and split transactions. The 603e can 
pipeline as many as two transactions and has limited support for out-of-order split-bus 
transactions. 
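
The beat counts listed above follow from dividing the transfer size by the data bus width. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the size check simply reflects the transfer sizes named in this section.

```python
def data_beats(size_bytes, bus_bits=64):
    """Return the number of data beats for a transfer of size_bytes."""
    if bus_bits not in (32, 64):
        raise ValueError("data bus is 32 or 64 bits wide")
    if not (1 <= size_bytes <= 8 or size_bytes == 32):
        raise ValueError("transfers are 1-8 bytes or a 32-byte burst")
    bus_bytes = bus_bits // 8
    return -(-size_bytes // bus_bytes)  # ceiling division
```

A 32-byte cache line therefore takes four beats on the 64-bit bus and eight beats in 32-bit mode, while an 8-byte access takes one beat or two beats, respectively.
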

Access to the system interface is granted through an external arbitration mechanism that 
allows devices to compete for bus mastership. This arbitration mechanism is flexible, 
allowing the 603e to be integrated into systems that implement various fairness and bus- 
parking procedures to avoid arbitration overhead. 

Typically, memory accesses are weakly ordered — sequences of operations, including 
load/store string and multiple instructions, do not necessarily complete in the order they 
begin — maximizing the efficiency of the bus without sacrificing coherency of the data. The 
603e allows load operations to precede store operations (except when a dependency exists). 
In addition, the 603e can be configured to reorder high-priority store operations ahead of 
lower-priority store operations. Because the processor can dynamically optimize run-time 
ordering of load/store traffic, overall performance is improved. 

Note that the Synchronize (sync) instruction can be used to enforce strong ordering. 

The following sections describe how the 603e interface operates, providing detailed timing 
diagrams that illustrate how the signals interact. A collection of more general timing
diagrams is included as examples of typical bus operations.

Figure 8-2 is a legend of the conventions used in the timing diagrams. 

This is a synchronous interface — all 603e input signals are sampled and output signals are 
driven on the rising edge of the bus clock cycle (see the PowerPC 603e RISC 
Microprocessor Hardware Specifications for exact timing information). 
[Figure 8-2 symbols not reproduced; the legend entries are listed below.]

• Bar over signal name: indicates active low
• ap0: 603e input (while 603e is a bus master)
• 603e output (while 603e is a bus master)
• ADDR+: 603e output (grouped; here, address plus attributes)
• qual BG: 603e internal signal (inaccessible to the user, but used in
diagrams to clarify operations)
• Compelling dependency: event will occur on the next clock cycle
• Prerequisite dependency: event will occur on an undetermined subsequent clock cycle
• 603e three-state output or input
• 603e nonsampled input
• Signal with sample point
• A sampled condition (dot on high or low state) with multiple dependencies
• Timing for a signal had it been asserted (it is not actually asserted)


Figure 8-2. Timing Diagram Legend 

8.1.2.1 Optional 32-Bit Data Bus Mode

The 603e supports an optional 32-bit data bus mode. The 32-bit data bus mode operates the 
same as the 64-bit data bus mode with the exception of the byte lanes involved in the 
transfer and the number of data beats that are performed. The number of data beats required 
for a data tenure in the 32-bit data bus mode is one, two, or eight beats depending on the 
size of the program transaction and the cache mode for the address. For additional 
information about 32-bit data bus mode, see Section 8.6.1, “32-Bit Data Bus Mode.” 
8.1.3 Direct-Store Accesses

The 603e does not support the extended transfer protocol for accesses to the direct-store 
storage space. The transfer protocol used for any given access is selected by the T bit in the 
MMU segment registers; if the T bit is set, the memory access is a direct-store access. An 
attempt to access a direct-store segment will result in the 603e taking a DSI exception.

8.2 Memory Access Protocol 

Memory accesses are divided into address and data tenures. Each tenure has three phases — 
bus arbitration, transfer, and termination. The 603e also supports address-only transactions. 
Note that address and data tenures can overlap, as shown in Figure 8-3. 

Figure 8-3 shows that the address and data tenures are distinct from one another and that 
both consist of three phases — arbitration, transfer, and termination. Address and data 
tenures are independent (indicated in Figure 8-3 by the fact that the data tenure begins 
before the address tenure ends), which allows split-bus transactions to be implemented at 
the system level in multiprocessor systems. Figure 8-3 shows a data transfer that consists 
of a single-beat transfer of as many as 64 bits. Four-beat burst transfers of 32-byte cache 
lines require data transfer termination signals for each beat of data. 



[Figure 8-3 diagram: the address tenure and the data tenure are independent and overlap;
each is divided into arbitration, transfer (single-beat), and termination phases.]

Figure 8-3. Overlapping Tenures on the PowerPC 603e Microprocessor Bus for a
Single-Beat Transfer

The basic functions of the address and data tenures are as follows: 

• Address tenure 

— Arbitration: During arbitration, address bus arbitration signals are used to gain 
mastership of the address bus. 

— Transfer: After the 603e is the address bus master, it transfers the address on the 
address bus. The address signals and the transfer attribute signals control the 
address transfer. The address parity and address parity error signals ensure the 
integrity of the address transfer. 
— Termination: After the address transfer, the system signals that the address tenure
is complete or that it must be repeated. 

• Data tenure 

— Arbitration: To begin the data tenure, the 603e arbitrates for mastership of the 
data bus. 

— Transfer: After the 603e is the data bus master, it samples the data bus for read 
operations or drives the data bus for write operations. The data parity and data 
parity error signals ensure the integrity of the data transfer. 

— Termination: Data termination signals are required after each data beat in a data 
transfer. Note that in a single-beat transaction, the data termination signals also 
indicate the end of the tenure, while in burst accesses, the data termination 
signals apply to individual beats and indicate the end of the tenure only after the 
final data beat. 

The 603e generates an address-only bus transfer during the execution of the dcbz 
instruction, which uses only the address bus with no data transfer involved. Additionally, 
the 603e’s retry capability provides an efficient snooping protocol for systems with 
multiple memory systems (including caches) that must remain coherent. 

8.2.1 Arbitration Signals 

Arbitration for both address and data bus mastership is performed by a central, external 
arbiter and, minimally, by the arbitration signals shown in Section 7.2.1, “Address Bus 
Arbitration Signals.” Most arbiter implementations require additional signals to coordinate 
bus master/slave/snooping activities. Note that address bus busy (ABB) and data bus busy
(DBB) are bidirectional signals. These signals are inputs unless the 603e has mastership of 
one or both of the respective buses; they must be connected high through pull-up resistors 
so that they remain negated when no devices have control of the buses. 

The following list describes the address arbitration signals: 

• BR (bus request): Assertion indicates that the 603e is requesting mastership of the
address bus.

• BG (bus grant): Assertion indicates that the 603e may, with the proper
qualification, assume mastership of the address bus. A qualified bus grant occurs
when BG is asserted and ABB and ARTRY are negated.

If the 603e is parked, BR need not be asserted for the qualified bus grant.

• ABB (address bus busy): Assertion by the 603e indicates that the 603e is the
address bus master.
The following list describes the data arbitration signals:

• DBG (data bus grant): Indicates that the 603e may, with the proper qualification,
assume mastership of the data bus. A qualified data bus grant occurs when DBG is
asserted while DBB, DRTRY, and ARTRY are negated.

The DBB signal is driven by the current bus master, DRTRY is only driven from the
bus, and ARTRY is from the bus, but only for the address bus tenure associated with
the current data bus tenure (that is, not from another address tenure).

• DBWO (data bus write only): Assertion indicates that the 603e may perform the
data bus tenure for an outstanding write address even if a read address is pipelined
before the write address. If DBWO is asserted, the 603e will assume data bus
mastership for a pending data bus write operation; the 603e will take the data bus for
a pending read operation if this input is asserted along with DBG and no write is
pending. Care must be taken with DBWO to ensure the desired write is queued (for
example, a cache-line snoop push-out operation).

• DBB (data bus busy): Assertion by the 603e indicates that the 603e is the data bus
master. The 603e always assumes data bus mastership if it needs the data bus and is
given a qualified data bus grant (see DBG).

For more detailed information on the arbitration signals, refer to Section 7.2.1, 
“Address Bus Arbitration Signals,” and Section 7.2.6, “Data Bus Arbitration 
Signals.” 
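
The data bus grant qualification and the effect of DBWO described in the list above can be sketched as follows. This is a minimal model, assuming each argument is True when the corresponding signal is asserted (regardless of its electrical polarity); the names `qualified_data_bus_grant` and `next_data_tenure` are hypothetical and are used only for illustration.

```python
from typing import List, Optional


def qualified_data_bus_grant(dbg: bool, dbb: bool, drtry: bool, artry: bool) -> bool:
    """DBG asserted while DBB, DRTRY, and ARTRY are all negated."""
    return dbg and not (dbb or drtry or artry)


def next_data_tenure(pending: List[str], dbwo: bool) -> Optional[int]:
    """Pick which outstanding address tenure gets the data bus next.

    pending lists the outstanding transactions in address order, each
    either "read" or "write". Returns an index into pending, or None
    when nothing is outstanding.
    """
    if not pending:
        return None
    if dbwo and "write" in pending:
        return pending.index("write")   # write data tenure may pass a pipelined read
    return 0                            # otherwise data tenures follow address order
```

With a read pipelined ahead of a write, `next_data_tenure(["read", "write"], dbwo=True)` selects the write, while the same call with `dbwo=False` keeps the data tenures in address order.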

8.2.2 Address Pipelining and Split-Bus Transactions 

The 603e protocol provides independent address and data bus capability to support 
pipelined and split-bus transaction system organizations. Address pipelining allows the 
address tenure of a new bus transaction to begin before the data tenure of the current 
transaction has finished. Split-bus transaction capability allows other bus activity to occur 
(either from the same master or from different masters) between the address and data 
tenures of a transaction. 

While this capability does not inherently reduce memory latency, support for address 
pipelining and split-bus transactions can greatly improve effective bus/memory 
throughput. For this reason, these techniques are most effective in shared-memory 
multiprocessor implementations where bus bandwidth is an important measurement of 
system performance. 

External arbitration is required in systems in which multiple devices must compete for the 
system bus. The design of the external arbiter affects pipelining by regulating address bus
grant (BG), data bus grant (DBG), and address acknowledge (AACK) signals. For
example, a one-level pipeline is enabled by asserting AACK to the current address bus 
master and granting mastership of the address bus to the next requesting master before the 
current data bus tenure has completed. Two address tenures can occur before the current 
data bus tenure completes. 
The 603e can pipeline its own transactions to a depth of one level (intraprocessor
pipelining); however, the 603e bus protocol does not constrain the maximum number of 
levels of pipelining that can occur on the bus between multiple masters (interprocessor 
pipelining). The external arbiter must control the pipeline depth and synchronization 
between masters and slaves. 

In a pipelined implementation, data bus tenures are kept in strict order with respect to 
address tenures. However, external hardware can further decouple the address and data 
buses, allowing the data tenures to occur out of order with respect to the address tenures. 
This requires some form of system tag to associate the out-of-order data transaction with 
the proper originating address transaction (not defined for the 603e interface). Individual 
bus requests and data bus grants from each processor can be used by the system to 
implement tags to support interprocessor, out-of-order transactions. 

The 603e supports a limited intraprocessor out-of-order, split-transaction capability via the
data bus write only (DBWO) signal. For more information about using DBWO, see
Section 8.10, “Using Data Bus Write Only.” 

8.3 Address Bus Tenure 

This section describes the three phases of the address tenure — address bus arbitration, 
address transfer, and address termination. 

8.3.1 Address Bus Arbitration 

When the 603e needs access to the external bus and it is not parked (BG is negated), it 
asserts bus request (BR) until it is granted mastership of the bus and the bus is available 
(see Figure 8-4). The external arbiter must grant master-elect status to the potential master 
by asserting the bus grant (BG) signal. The 603e requesting the bus determines that the bus
is available when the ABB input is negated. A qualified bus grant exists when the address
bus is not busy (ABB input is negated), BG is asserted, and the address retry (ARTRY) input
is negated. The potential master assumes address bus mastership by
asserting ABB when it receives a qualified bus grant.
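
The qualified bus grant condition can be written as a single Boolean expression. The sketch below assumes each argument is True when the signal is asserted, independent of the active-low pin level; `qualified_bus_grant` is a hypothetical helper name.

```python
def qualified_bus_grant(bg: bool, abb: bool, artry: bool) -> bool:
    """BG asserted while ABB and ARTRY are both negated."""
    return bg and not abb and not artry


def may_take_address_bus(needs_bus: bool, bg: bool, abb: bool, artry: bool) -> bool:
    """The potential master asserts ABB only if it needs the bus and the grant is qualified."""
    return needs_bus and qualified_bus_grant(bg, abb, artry)
```

Note that BR does not appear in the expression: a parked processor can see a qualified bus grant without having asserted BR.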
External arbiters must allow only one device at a time to be the address bus master.
In implementations in which no other device can be a master, BG can be grounded (always
asserted) to continually grant mastership of the address bus to the 603e.

If the 603e asserts BR before the external arbiter asserts BG, the 603e is considered to be 
unparked, as shown in Figure 8-4. Figure 8-5 shows the parked case, where a qualified bus 
grant exists on the clock edge following a need_bus condition. Notice that the bus clock 
cycle required for arbitration is eliminated if the 603e is parked, reducing overall memory
latency for a transaction. The 603e always negates ABB for at least one bus clock cycle
after AACK is asserted, even if it is parked and has another transaction pending. 

Typically, bus parking is provided to the device that was the most recent bus master;
however, system designers may choose other schemes such as providing unrequested bus 
grants in situations where it is easy to correctly predict the next device requesting bus 
mastership. 
Figure 8-5. Address Bus Arbitration Showing Bus Parking

When the 603e receives a qualified bus grant, it assumes address bus mastership by 
asserting ABB and negating the BR output signal. Meanwhile, the 603e drives the address 
for the requested access onto the address bus and asserts TS to indicate the start of a new 
transaction. 

When designing external bus arbitration logic, note that the 603e may assert BR without 
using the bus after it receives the qualified bus grant. For example, in a system using bus 
snooping, if the 603e asserts BR to perform a replacement copy-back operation, another 
device can invalidate that line before the 603e is granted mastership of the bus. Once the 
603e is granted the bus, it no longer needs to perform the copy-back operation; therefore, 
the 603e does not assert ABB and does not use the bus for the copy-back operation. Note 
that the 603e asserts BR for at least one clock cycle in these instances. 

8.3.2 Address Transfer 

During the address transfer, the physical address and all attributes of the transaction are 
transferred from the bus master to the slave device(s). Snooping logic may monitor the 
transfer to enforce cache coherency; see discussion about snooping in Section 8.3.3, 
“Address Transfer Termination.” 
The signals used in the address transfer include the following signal groups:

• Address transfer start signal: Transfer start (TS)

• Address transfer signals: Address bus (A0-A31), address parity (AP0-AP3), and
address parity error (APE)

• Address transfer attribute signals: Transfer type (TT0-TT4), transfer code (TC0-TC1),
transfer size (TSIZ0-TSIZ2), transfer burst (TBST), cache inhibit (CI),
write-through (WT), global (GBL), and cache set element (CSE0-CSE1)

Figure 8-6 shows that the timing for all of these signals, except TS and APE, is identical.
All of the address transfer and address transfer attribute signals are combined into the
ADDR+ grouping in Figure 8-6. The TS signal indicates that the 603e has begun an address
transfer and that the address and transfer attributes are valid (within the context of a
synchronous bus). The 603e always asserts TS coincident with ABB. As an input, TS need
not coincide with the assertion of ABB on the bus (that is, TS can be asserted with, or on
a subsequent clock cycle after, the assertion of ABB; the 603e tracks this transaction correctly).



[Figure 8-6 timing diagram: bus clock cycles 0 through 4]




In Figure 8-6, the address transfer occurs during bus clock cycles 1 and 2 (arbitration 
occurs in bus clock cycle 0 and the address transfer is terminated in bus clock 3). In this
diagram, the address bus termination input, AACK, is asserted to the 603e on the bus clock
following assertion of TS (as shown by the dependency line). This is the minimum duration
of the address transfer for the 603e; the duration can be extended by delaying the assertion
of AACK for one or more bus clocks. 
8.3.2.1 Address Bus Parity

The 603e always generates 1 bit of correct odd-byte parity for each of the 4 bytes of address 
when a valid address is on the bus. The calculated values are placed on the AP0-AP3 
outputs when the 603e is the address bus master. If the 603e is not the master and TS and 
GBL are asserted together (qualified condition for snooping memory operations), the 
calculated values are compared with the AP0-AP3 inputs. If there is an error, and address 
parity checking is enabled (HID0[EBA] set to 1), the APE output is asserted. An address 
bus parity error causes a checkstop condition if MSR[ME] is cleared to 0. For more 
information about checkstop conditions, see Chapter 4, “Exceptions.” 
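
The per-byte odd parity described above can be sketched as follows. This example assumes that AP0 covers the most significant address byte (A0-A7) and AP3 the least significant (A24-A31), following the PowerPC convention in which bit 0 is the most significant bit; the function names are hypothetical.

```python
from typing import List


def address_parity(address: int) -> List[int]:
    """Return [AP0, AP1, AP2, AP3], one odd-parity bit per address byte."""
    parity_bits = []
    for shift in (24, 16, 8, 0):              # A0-A7 first, A24-A31 last
        byte = (address >> shift) & 0xFF
        ones = bin(byte).count("1")
        # Odd parity: the byte plus its parity bit holds an odd number of ones.
        parity_bits.append(0 if ones % 2 else 1)
    return parity_bits


def address_parity_error(address: int, ap_inputs: List[int]) -> bool:
    """True when the sampled AP0-AP3 inputs disagree with the computed parity."""
    return address_parity(address) != list(ap_inputs)
```

A snooping device would call something like `address_parity_error` only for qualified snoop cycles (TS and GBL asserted together), and would report the error only when address parity checking is enabled.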

8.3.2.2 Address Transfer Attribute Signals 

The transfer attribute signals include several encoded signals such as the transfer type 
(TT0-TT4) signals, transfer burst (TBST) signal, transfer size (TSIZ0-TSIZ2) signals, and
transfer code (TC0-TC1) signals. Section 7.2.4, “Address Transfer Attribute Signals,”
describes the encodings for the address transfer attribute signals. 

8.3.2.2.1 Transfer Type (TT0-TT4) Signals 

Snooping logic should fully decode the transfer type signals if the GBL signal is asserted. 
Slave devices can sometimes use the individual transfer type signals without fully decoding 
the group. For a complete description of the encoding for transfer type signals TT0-TT4, 
refer to Table 8-1 and Table 8-2. 

8.3.2.2.2 Transfer Size (TSIZ0-TSIZ2) Signals 

The transfer size signals (TSIZ0-TSIZ2) indicate the size of the requested data transfer as
shown in Table 8-1. The TSIZ0-TSIZ2 signals may be used along with TBST and A29-A31
to determine which portion of the data bus contains valid data for a write transaction
or which portion of the bus should contain valid data for a read transaction. Note that for a
burst transaction (as indicated by the assertion of TBST), TSIZ0-TSIZ2 are always set to
0b010. Therefore, if the TBST signal is asserted, the memory system should transfer a total
of eight words (32 bytes), regardless of the TSIZ0-TSIZ2 encoding.



Table 8-1. Transfer Size Signal Encodings

TBST      TSIZ0  TSIZ1  TSIZ2  Transfer Size
Asserted  0      1      0      Eight-word burst
Negated   0      0      0      Eight bytes
Negated   0      0      1      One byte
Negated   0      1      0      Two bytes
Negated   0      1      1      Three bytes
Negated   1      0      0      Four bytes
Negated   1      0      1      Five bytes (N/A)
Negated   1      1      0      Six bytes (N/A)
Negated   1      1      1      Seven bytes (N/A)



The basic coherency size of the bus is defined to be 32 bytes (corresponding to one cache 
line). Data transfers that cross an aligned, 32-byte boundary either must present a new 
address onto the bus at that boundary (for coherency consideration) or must operate as 
noncoherent data with respect to the 603e. The 603e never generates a bus transaction with 
a transfer size of 5 bytes, 6 bytes, or 7 bytes. 
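
A memory controller model might decode the transfer size from TBST and TSIZ0-TSIZ2 along the lines of the sketch below. It assumes `tsiz` is the 3-bit value formed with TSIZ0 as the most significant bit and that `tbst_asserted` is True when TBST is asserted; the function name is hypothetical.

```python
def transfer_size_bytes(tbst_asserted: bool, tsiz: int) -> int:
    """Decode the Table 8-1 encodings into a byte count."""
    if not 0 <= tsiz <= 0b111:
        raise ValueError("TSIZ0-TSIZ2 is a 3-bit field")
    if tbst_asserted:
        return 32             # burst: eight words, regardless of TSIZ0-TSIZ2
    if tsiz == 0b000:
        return 8              # 0b000 encodes a full double word
    return tsiz               # 0b001 through 0b111 encode 1 through 7 bytes
```

Sizes of 5, 6, and 7 bytes decode correctly here, but the 603e never generates them, so a model may reasonably treat them as errors.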

8.3.2.3 Burst Ordering During Data Transfers

During burst data transfer operations, 32 bytes of data (one cache line) are transferred to or 
from the cache in order. Burst write transfers are always performed zero double word first, 
but since burst reads are performed critical double word first, a burst read transfer may not 
start with the first double word of the cache line, and the cache line fill may wrap around 
the end of the cache line. This section describes the burst ordering for the 603e when 
operating in either the 64- or 32-bit bus mode. 

Table 8-2 describes the burst ordering when the 603e is configured with a 64-bit data bus. 



Table 8-2. Burst Ordering (64-Bit Bus)

                  For Starting Address:
Data Transfer     A27-A28 = 00  A27-A28 = 01  A27-A28 = 10  A27-A28 = 11
First data beat   DW0           DW1           DW2           DW3
Second data beat  DW1           DW2           DW3           DW0
Third data beat   DW2           DW3           DW0           DW1
Fourth data beat  DW3           DW0           DW1           DW2

Note: A29-A31 are always 0b000 for burst transfers by the 603e.



Table 8-3 describes the burst ordering when the 603e is configured with a 32-bit bus. 
Table 8-3. Burst Ordering (32-Bit Bus)

                   For Starting Address:
Data Transfer      A27-A28 = 00  A27-A28 = 01  A27-A28 = 10  A27-A28 = 11
First data beat    DW0-U         DW1-U         DW2-U         DW3-U
Second data beat   DW0-L         DW1-L         DW2-L         DW3-L
Third data beat    DW1-U         DW2-U         DW3-U         DW0-U
Fourth data beat   DW1-L         DW2-L         DW3-L         DW0-L
Fifth data beat    DW2-U         DW3-U         DW0-U         DW1-U
Sixth data beat    DW2-L         DW3-L         DW0-L         DW1-L
Seventh data beat  DW3-U         DW0-U         DW1-U         DW2-U
Eighth data beat   DW3-L         DW0-L         DW1-L         DW2-L

Notes: A29-A31 are always 0b000 for burst transfers by the 603e.

“U” and “L” represent the upper and lower word of the double word respectively.
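
The wrap-around ordering in Table 8-2 and Table 8-3 follows a simple rule: start at the double word selected by A27-A28 and advance modulo 4. The sketch below generates the beat sequence for a burst read; `burst_read_order` is a hypothetical name, and burst writes (which always begin with double word 0) correspond to a starting value of 0.

```python
from typing import List


def burst_read_order(a27_a28: int, bus_width_bits: int = 64) -> List[str]:
    """List the data beats of a critical-double-word-first burst read."""
    if not 0 <= a27_a28 <= 3:
        raise ValueError("A27-A28 is a 2-bit field")
    if bus_width_bits not in (32, 64):
        raise ValueError("bus width must be 32 or 64 bits")
    double_words = [(a27_a28 + beat) % 4 for beat in range(4)]
    if bus_width_bits == 64:
        return [f"DW{dw}" for dw in double_words]
    # 32-bit mode: each double word takes two beats, upper word first.
    return [f"DW{dw}-{half}" for dw in double_words for half in ("U", "L")]
```

For a starting address with A27-A28 = 0b10, the 64-bit sequence is DW2, DW3, DW0, DW1, matching the third column of Table 8-2.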



8.3.2.4 Effect of Alignment in Data Transfers (64-Bit Bus)

Table 8-4 lists the aligned transfers that can occur on the 603e bus when configured with a 
64-bit width. These are transfers in which the data is aligned to an address that is an integer 
multiple of the size of the data. For example, Table 8-4 shows that 1-byte data is always
aligned; however, for a 4-byte word to be aligned, it must be oriented on an address that is 
a multiple of 4. 




Table 8-4. Aligned Data Transfers (64-Bit Bus)

                                              Data Bus Byte Lane(s)
Transfer Size  TSIZ0  TSIZ1  TSIZ2  A29-A31  0  1  2  3  4  5  6  7
Byte           0      0      1      000      √  -  -  -  -  -  -  -
               0      0      1      001      -  √  -  -  -  -  -  -
               0      0      1      010      -  -  √  -  -  -  -  -
               0      0      1      011      -  -  -  √  -  -  -  -
               0      0      1      100      -  -  -  -  √  -  -  -
               0      0      1      101      -  -  -  -  -  √  -  -
               0      0      1      110      -  -  -  -  -  -  √  -
               0      0      1      111      -  -  -  -  -  -  -  √
Half word      0      1      0      000      √  √  -  -  -  -  -  -
               0      1      0      010      -  -  √  √  -  -  -  -
               0      1      0      100      -  -  -  -  √  √  -  -
               0      1      0      110      -  -  -  -  -  -  √  √
Word           1      0      0      000      √  √  √  √  -  -  -  -
               1      0      0      100      -  -  -  -  √  √  √  √
Double word    0      0      0      000      √  √  √  √  √  √  √  √

Notes:
√: These entries indicate the byte portions of the requested operand that are read or written during
that bus transaction.

-: These entries are not required and are ignored during read transactions and are driven with
undefined data during all write transactions.



The 603e supports misaligned memory operations, although their use may substantially
degrade performance. Misaligned memory transfers address memory that is not aligned to
the size of the data being transferred (for example, a word read of an odd byte address).
Although most of these operations hit in the primary cache (or generate burst memory
operations if they miss), the 603e interface supports misaligned transfers within a word
(32-bit aligned) boundary, as shown in Table 8-5. Note that the 4-byte transfer in Table 8-5 is
only one example of misalignment. As long as the attempted transfer does not cross a word
boundary, the 603e can transfer the data on the misaligned address (for example, a
half-word read from an odd byte-aligned address). An attempt to address data that crosses a
word boundary requires two bus transfers to access the data.

Due to the performance degradations associated with misaligned memory operations, they 
are best avoided. In addition to the double-word straddle boundary condition, the address 
translation logic can generate substantial exception overhead when the load/store multiple 
and load/store string instructions access misaligned data. It is strongly recommended that 
software attempt to align code and data where possible. 
Table 8-5. Misaligned Data Transfers (Four-Byte Examples)

Transfer Size                                    Data Bus Byte Lanes
(Four Bytes)              TSIZ(0-2)  A29-A31  0  1  2  3  4  5  6  7
Aligned                   100        000      A  A  A  A  -  -  -  -
Misaligned, first access  011        001      -  A  A  A  -  -  -  -
            second access 001        100      -  -  -  -  A  -  -  -
Misaligned, first access  010        010      -  -  A  A  -  -  -  -
            second access 010        100      -  -  -  -  A  A  -  -
Misaligned, first access  001        011      -  -  -  A  -  -  -  -
            second access 011        100      -  -  -  -  A  A  A  -
Aligned                   100        100      -  -  -  -  A  A  A  A
Misaligned, first access  011        101      -  -  -  -  -  A  A  A
            second access 001        000      A  -  -  -  -  -  -  -
Misaligned, first access  010        110      -  -  -  -  -  -  A  A
            second access 010        000      A  A  -  -  -  -  -  -
Misaligned, first access  001        111      -  -  -  -  -  -  -  A
            second access 011        000      A  A  A  -  -  -  -  -

Notes:
A: Byte lane used
-: Byte lane not used
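
The rows of Table 8-5 can be derived mechanically: an access is cut at each word (4-byte) boundary, and each piece is presented with its own size and A29-A31 value. The following sketch does this for accesses of up to 4 bytes on the 64-bit bus; the function name and the tuple layout are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def split_misaligned(offset: int, size: int = 4) -> List[Tuple[str, str, List[int]]]:
    """Split an access into bus operations at word boundaries.

    offset is the starting byte address within the double word (A29-A31).
    Each result is (TSIZ(0-2) bits, A29-A31 bits, byte lanes used).
    """
    if not 0 <= offset <= 7 or not 1 <= size <= 4:
        raise ValueError("expects offset 0-7 and size 1-4")
    operations = []
    address, remaining = offset, size
    while remaining > 0:
        room = 4 - (address % 4)            # bytes left before the next word boundary
        count = min(room, remaining)
        low = address % 8                   # A29-A31 for this bus operation
        lanes = list(range(low, low + count))
        operations.append((format(count, "03b"), format(low, "03b"), lanes))
        address += count
        remaining -= count
    return operations
```

For example, `split_misaligned(1)` yields a 3-byte operation at A29-A31 = 001 followed by a 1-byte operation at 100, while a half-word at an odd offset such as `split_misaligned(1, 2)` stays within the word and needs only one operation.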

8.3.2.5 Effect of Alignment in Data Transfers (32-Bit Bus)

The aligned data transfer cases for 32-bit data bus mode are shown in Table 8-6. All of the
transfers require a single data beat (if caching-inhibited or write-through) except for
double-word cases, which require two data beats. The double-word case is only generated
by the 603e for load or store double operations to/from the floating-point registers (FPRs). All
caching-inhibited instruction fetches are performed as word operations.
Table 8-6. Aligned Data Transfers (32-Bit Bus Mode)

                                              Data Bus Byte Lane(s)
Transfer Size  TSIZ0  TSIZ1  TSIZ2  A29-A31  0  1  2  3  4  5  6  7
Byte           0      0      1      000      A  -  -  -  x  x  x  x
               0      0      1      001      -  A  -  -  x  x  x  x
               0      0      1      010      -  -  A  -  x  x  x  x
               0      0      1      011      -  -  -  A  x  x  x  x
               0      0      1      100      A  -  -  -  x  x  x  x
               0      0      1      101      -  A  -  -  x  x  x  x
               0      0      1      110      -  -  A  -  x  x  x  x
               0      0      1      111      -  -  -  A  x  x  x  x
Half word      0      1      0      000      A  A  -  -  x  x  x  x
               0      1      0      010      -  -  A  A  x  x  x  x
               0      1      0      100      A  A  -  -  x  x  x  x
               0      1      0      110      -  -  A  A  x  x  x  x
Word           1      0      0      000      A  A  A  A  x  x  x  x
               1      0      0      100      A  A  A  A  x  x  x  x
Double word    0      0      0      000      A  A  A  A  x  x  x  x
  Second beat  0      0      0      000      A  A  A  A  x  x  x  x

Notes:
A: Byte lane used
-: Byte lane not used
x: Byte lane not used in 32-bit bus mode

Misaligned data transfers when the 603e is configured with a 32-bit data bus operate in the 
same way as when configured with a 64-bit data bus, with the exception that only the
DH0-DH31 data bus is used. See Table 8-7 for an example of a 4-byte misaligned transfer
starting at each possible byte address within a double word. 
Table 8-7. Misaligned 32-Bit Data Bus Transfer (Four-Byte Examples)

Transfer Size                                    Data Bus Byte Lanes
(Four Bytes)              TSIZ(0-2)  A29-A31  0  1  2  3  4  5  6  7
Aligned                   100        000      A  A  A  A  x  x  x  x
Misaligned, first access  011        001      -  A  A  A  x  x  x  x
            second access 001        100      A  -  -  -  x  x  x  x
Misaligned, first access  010        010      -  -  A  A  x  x  x  x
            second access 010        100      A  A  -  -  x  x  x  x
Misaligned, first access  001        011      -  -  -  A  x  x  x  x
            second access 011        100      A  A  A  -  x  x  x  x
Aligned                   100        100      A  A  A  A  x  x  x  x
Misaligned, first access  011        101      -  A  A  A  x  x  x  x
            second access 001        000      A  -  -  -  x  x  x  x
Misaligned, first access  010        110      -  -  A  A  x  x  x  x
            second access 010        000      A  A  -  -  x  x  x  x
Misaligned, first access  001        111      -  -  -  A  x  x  x  x
            second access 011        000      A  A  A  -  x  x  x  x

Notes:
A: Byte lane used
-: Byte lane not used
x: Byte lane not used in 32-bit bus mode
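
In 32-bit bus mode only byte lanes 0 through 3 carry data, so the lane for a given byte is its address modulo 4. The short sketch below expresses that folding for a single bus operation; `lanes_32bit` is a hypothetical helper, and it assumes the operation has already been split so that it does not cross a word boundary.

```python
from typing import List


def lanes_32bit(a29_a31: int, byte_count: int) -> List[int]:
    """Byte lanes (0-3) used by one bus operation in 32-bit data bus mode."""
    if not 0 <= a29_a31 <= 7 or not 1 <= byte_count <= 4:
        raise ValueError("expects A29-A31 in 0-7 and 1 to 4 bytes")
    first_lane = a29_a31 % 4                # upper and lower words share lanes 0-3
    if first_lane + byte_count > 4:
        raise ValueError("operation crosses a word boundary")
    return list(range(first_lane, first_lane + byte_count))
```

A 3-byte operation at A29-A31 = 101 therefore uses lanes 1 through 3, the same lanes as the equivalent operation at 001.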

8.3.2.5.1 Alignment of External Control Instructions

The size of the data transfer associated with the eciwx and ecowx instructions is always
4 bytes. However, if the eciwx or ecowx instruction is misaligned and crosses any word
boundary, the 603e will generate two bus operations, each with a size of fewer than 4 bytes.
For the first bus operation, bits A29-A31 equal bits 29-31 of the effective address of the
instruction, which will be 0b101, 0b110, or 0b111. The size associated with the first bus
operation will be 3, 2, or 1 bytes, respectively. For the second bus operation, bits A29-A31
equal 0b000, and the size associated with the operation will be 1, 2, or 3 bytes, respectively.
For both operations, TBST and TSIZ0-TSIZ2 are redefined to specify the resource ID
(RID). The resource ID is copied from bits 28-31 of the EAR. For eciwx/ecowx operations,
the state of bit 28 of the EAR is presented by the TBST signal without inversion (if
EAR[28] = 1, TBST = 1). The size of the second bus operation cannot be deduced from the
operation itself; the system must determine how many bytes were transferred on the first
bus operation to determine the size of the second operation.
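
The sizes of the two bus operations and the resource ID signals can be sketched as shown below. This is only an illustration: it assumes that EAR bits 29-31 are presented on TSIZ0-TSIZ2 in that order (the text above states the mapping explicitly only for EAR[28] and TBST), and the function name and result layout are hypothetical.

```python
def external_control_bus_ops(effective_address: int, ear: int) -> dict:
    """Model the bus operations generated by an eciwx or ecowx access."""
    rid = ear & 0xF                      # EAR bits 28-31 are the four low-order bits
    tbst = (rid >> 3) & 1                # EAR[28], presented without inversion
    tsiz = rid & 0b111                   # EAR[29-31] on TSIZ0-TSIZ2 (assumed mapping)
    misalignment = effective_address & 0b11
    if misalignment == 0:
        sizes = [4]                      # aligned: a single 4-byte operation
    else:
        # Misaligned: the first operation runs to the word boundary,
        # the second carries the remaining bytes.
        sizes = [4 - misalignment, misalignment]
    return {"sizes": sizes, "tbst": tbst, "tsiz": tsiz}
```

Because TSIZ0-TSIZ2 carry the RID rather than a size, a system model has to compute `sizes[1]` from `sizes[0]` exactly as the text describes.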
Furthermore, the two bus operations associated with such a misaligned external control
instruction are not atomic. That is, the 603e may initiate other types of memory operations 
between the two transfers. Also, the two bus operations associated with a misaligned ecowx 
may be interrupted by an eciwx bus operation, and vice versa. The 603e does guarantee that 
the two operations associated with a misaligned ecowx will not be interrupted by another 
ecowx operation; and likewise for eciwx. 

Because a misaligned external control address is considered a programming error, the 
system may choose to assert TEA or otherwise cause an exception when a misaligned 
external control bus operation occurs. (The term exception is referred to as interrupt in the
architecture specification.)

8.3.2.6 Transfer Code (TC0-TC1) Signals

The TC0 and TC1 signals provide supplemental information about the corresponding
address. Note that the TCx signals can be used with the TT0-TT4 and TBST signals to
further define the current transaction.

Table 8-8 shows the encodings of the TC0 and TC1 signals.



Table 8-8. Transfer Code Encoding

TC0-TC1  Read               Write
00       Data transaction   Any write
01       Touch load         N/A
10       Instruction fetch  N/A
11       (Reserved)         N/A



8.3.3 Address Transfer Termination 

The address tenure of a bus operation is terminated when completed with the assertion of
AACK, or retried with the assertion of ARTRY. The 603e does not terminate the address
transfer until the AACK (address acknowledge) input is asserted; therefore, the system can
extend the address transfer phase by delaying the assertion of AACK to the 603e. Although
AACK can be asserted as early as the bus clock cycle following TS (see Figure 8-7), which
allows a minimum address tenure of two bus cycles when the 603e clock is configured for
1:1 (processor clock to bus clock) mode, the ARTRY snoop response cannot be determined
in the minimum allowed address tenure period. When in 1:1 or 1.5:1 clock mode, AACK
must not be asserted until the third clock of the address tenure (one address wait state) to
allow the 603e an opportunity to assert ARTRY on the bus. For other clock configurations
(2:1, 2.5:1, 3:1, 3.5:1, and 4:1), the ARTRY snoop response can be determined in the
minimum address tenure period, and AACK may be asserted as early as the second bus
clock of the address tenure. As shown in Figure 8-7, these signals are asserted for one bus
clock cycle, three-stated for half of the next bus clock cycle, driven high until the following
bus cycle, and finally three-stated. Note that AACK must be asserted for only one bus clock
cycle.
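
The AACK timing rule above depends only on the processor-to-bus clock ratio. The following Python sketch restates it for a system designer checking a memory controller design; the function names are hypothetical, and the clock in which TS is asserted is counted as clock 1 of the address tenure.

```python
from fractions import Fraction

# Clock modes in which the ARTRY snoop response cannot be determined
# within the minimum two-clock address tenure: 1:1 and 1.5:1.
SLOW_SNOOP_RATIOS = {Fraction(1), Fraction(3, 2)}


def earliest_aack_clock(ratio):
    """Earliest address-tenure bus clock (TS clock = 1) on which AACK may be asserted."""
    return 3 if Fraction(ratio) in SLOW_SNOOP_RATIOS else 2


def address_wait_states(ratio):
    """Address wait states implied by the earliest legal AACK."""
    return earliest_aack_clock(ratio) - 2
```

Ratios may be passed as integers or as strings such as "1.5", which Fraction converts exactly.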
The address transfer can be terminated with the requirement to retry if ARTRY is asserted
anytime during the address tenure and through the cycle following AACK. The assertion 
causes the entire transaction (address and data tenure) to be rerun. As a snooping device, 
the 603e asserts ARTRY for a snooped transaction that hits modified data in the data cache 
that must be written back to memory, or if the snooped transaction could not be serviced. 
As a bus master, the 603e responds to an assertion of ARTRY by aborting the bus 
transaction and re-requesting the bus. Note that after recognizing an assertion of ARTRY 
and aborting the transaction in progress, the 603e is not guaranteed to run the same 
transaction the next time it is granted the bus due to internal reordering of load and store 
operations. 

If an address retry is required, the ARTRY response will be asserted by a bus snooping 
device as early as the second cycle after the assertion of TS (or until the third cycle 
following TS if 1:1 or 1.5:1 processor to bus clock ratio is selected). Once asserted, ARTRY
must remain asserted through the cycle after the assertion of AACK. The assertion of 
ARTRY during the cycle after the assertion of AACK is referred to as a qualified ARTRY. 
An earlier assertion of ARTRY during the address tenure is referred to as an early ARTRY. 

As a bus master, the 603e recognizes either an early or qualified ARTRY and prevents the 
data tenure associated with the retried address tenure. If the data tenure has already begun, 
the 603e aborts and terminates the data tenure immediately even if the burst data has been 
received. If the assertion of ARTRY is received up to or on the bus cycle following the first 
(or only) assertion of TA for the data tenure, the 603e ignores the first data beat, and if it is 
a load operation, does not forward data internally to the cache and execution units. If 
ARTRY is asserted after the first (or only) assertion of TA, improper operation of the bus 
interface may result. 

During the clock of a qualified ARTRY, the 603e also determines if it should negate BR and 
ignore BG on the following cycle. On the following cycle, only the snooping master that 
asserted ARTRY and needs to perform a snoop copy-back operation is allowed to assert
BR. This guarantees the snooping master an opportunity to request and be granted the bus 
before the just-retried master can restart its transaction. Note that a nonclocked bus arbiter 
may detect the assertion of address bus request by the bus master that asserted ARTRY, and 
return a qualified bus grant one cycle earlier than shown in Figure 8-7. 
8.4 Data Bus Tenure

This section describes the data bus arbitration, transfer, and termination phases defined by 
the 603e memory access protocol. The phases of the data tenure are identical to those of the 
address tenure, underscoring the symmetry in the control of the two buses. 

8.4.1 Data Bus Arbitration 

Data bus arbitration uses the data arbitration signal group: DBG, DBWO, and DBB.
Additionally, the combination of TS and TT0-TT4 provides information about the data bus 
request to external logic. 

The TS signal is an implied data bus request from the 603e; the arbiter must qualify TS with 
the transfer type (TT) encodings to determine if the current address transfer is an address- 
only operation, which does not require a data bus transfer (see Figure 8-7). If the data bus 
is needed, the arbiter grants data bus mastership by asserting the DBG input to the 603e. As 
with the address bus arbitration phase, the 603e must qualify the DBG input with a number 
of input signals before assuming bus mastership, as shown in Figure 8-8. 
Figure 8-8. Data Bus Arbitration



A qualified data bus grant can be expressed as the following: 

QDBG = DBG asserted while DBB, DRTRY, and ARTRY (associated with the data 
bus operation) are negated. 
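
The QDBG expression above is a simple Boolean condition. As a minimal sketch, it can be written as follows; each argument is True when the corresponding signal is asserted (regardless of electrical polarity), and the function name is hypothetical.

```python
def qualified_data_bus_grant(dbg, dbb, drtry, artry):
    """QDBG: DBG asserted while DBB, DRTRY, and ARTRY are all negated.

    `artry` refers only to an ARTRY associated with this data bus operation.
    """
    return dbg and not (dbb or drtry or artry)
```

Sampling this condition on each bus clock gives the clock of the qualified grant; the 603e asserts DBB on the following bus clock, subject to the overlap exception described next.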

When a data tenure overlaps with its associated address tenure, a qualified ARTRY 
assertion coincident with a data bus grant signal does not result in data bus mastership 
(DBB is not asserted). Otherwise, the 603e always asserts DBB on the bus clock cycle after 
recognition of a qualified data bus grant. Since the 603e can pipeline transactions, there 
may be an outstanding data bus transaction when a new address transaction is retried. In 
this case, the 603e becomes the data bus master to complete the previous transaction. 






8.4.1.1 Using the DBB Signal

The DBB signal should be connected between masters if data tenure scheduling is left to 
the masters. Optionally, the memory system can control data tenure scheduling directly 
with DBG. However, it is possible to ignore the DBB signal in the system if the DBB input 
is not used as the final data bus allocation control between data bus masters, and if the 
memory system can track the start and end of the data tenure. If DBB is not used to signal 
the end of a data tenure, DBG is only asserted to the next bus master the cycle before the 
cycle that the next bus master may actually begin its data tenure, rather than asserting it 
earlier (usually during another master’s data tenure) and allowing the negation of DBB to 
be the final gating signal for a qualified data bus grant. Even if DBB is ignored in the 
system, the 603e always recognizes its own assertion of DBB, and requires one cycle after 
data tenure completion to negate its own DBB before recognizing a qualified data bus grant 
for another data tenure. If DBB is ignored in the system, it must still be connected to a pull- 
up resistor on the 603e to ensure proper operation. 
8.4.2 Data Bus Write Only

As a result of address pipelining, the 603e may have up to two data tenures queued to
perform when it receives a qualified DBG. Generally, the data tenures should be performed
in strict order (the same order) as their address tenures were performed. The 603e, however,
also supports a limited out-of-order capability with the data bus write only (DBWO) input.
When recognized on the clock of a qualified DBG, DBWO may direct the 603e to perform
the next pending data write tenure even if a pending read tenure would have normally been
performed first. For more information on the operation of DBWO, refer to Section 8.10, 
“Using Data Bus Write Only.” 

If the 603e has any data tenures to perform, it always accepts data bus mastership to 
perform a data tenure when it recognizes a qualified DBG. If DBWO is asserted with a 
qualified DBG and no write tenure is queued to run, the 603e still takes mastership of the 
data bus to perform the next pending read data tenure. 

Generally, DBWO should only be used to allow a copy-back operation (burst write) to 
occur before a pending read operation. If DBWO is used for single-beat write operations, 
it may negate the effect of the eieio instruction by allowing a write operation to precede a 
program-scheduled read operation. 
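
The selection rule for DBWO can be modeled with a small queue. The sketch below is a behavioral illustration only, not a description of the 603e's internal logic; the queue representation and the name next_data_tenure are assumptions made for the example.

```python
from collections import deque


def next_data_tenure(pending, dbwo):
    """Pick the data tenure run on a qualified DBG.

    pending: deque of "read"/"write" entries in address-tenure order
             (the 603e queues at most two).
    dbwo:    True if DBWO is asserted on the clock of the qualified DBG.
    Returns the tenure chosen and removes it from the queue, or None if
    nothing is pending.
    """
    if not pending:
        return None
    if dbwo and "write" in pending:
        pending.remove("write")  # run the next pending write out of order
        return "write"
    return pending.popleft()  # otherwise keep strict address order
```

For example, with a read queued ahead of a copy-back write, asserting DBWO selects the write first; with only reads queued, DBWO has no effect and the next read is performed.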

8.4.3 Data Transfer 

The data transfer signals include DH0-DH31, DL0-DL31, DP0-DP7 and DPE. For 
memory accesses, the DH and DL signals form a 64-bit data path for read and write 
operations. 

The 603 transfers data in either single- or four-beat burst transfers when configured with a 
64-bit data bus; when configured with a 32-bit data bus, the 603 performs one-, two-, and 
eight-beat data transfers. Single-beat operations can transfer from 1 to 8 bytes at a time and 
can be misaligned; see Section 8.3.2.4, “Effect of Alignment in Data Transfers (64-Bit 
Bus).” Burst operations always transfer eight words and are aligned on eight-word address 
boundaries. Burst transfers can achieve significantly higher bus throughput than single-beat 
operations. 

The type of transaction initiated by the 603e depends on whether the code or data is 
cacheable and, for store operations whether the cache is considered in write-back or write- 
through mode, which software controls on either a page or block basis. Burst transfers 
support cacheable operations only; that is, memory structures must be marked as cacheable 
(and write-back for data store operations) in the respective page or block descriptor to take 
advantage of burst transfers. 
The 603e output TBST indicates to the system whether the current transaction is a single-
or four-beat transfer (except during eciwx/ecowx transactions, when it signals the state of
EAR[28]). A burst transfer has an assumed address order. For load or store operations that
miss in the cache (and are marked as cacheable and, for stores, write-back in the MMU),
the 603e uses the double-word-aligned address associated with the critical code or data that
initiated the transaction. This minimizes latency by allowing the critical code or data to be
forwarded to the processor before the rest of the cache line is filled. For all other burst
operations, however, the cache line is transferred beginning with the oct-word-aligned data.
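
The assumed address order can be illustrated for a 64-bit data bus, where a burst moves a 32-byte cache line in four 8-byte beats. The sketch below assumes that, after the critical double word, the remaining beats continue upward and wrap around within the line; that wraparound is an assumption of this example and is not stated in the text above. The function name is hypothetical.

```python
LINE_BYTES = 32  # eight words per burst
BEAT_BYTES = 8   # one double word per beat on a 64-bit data bus


def burst_beat_addresses(addr, critical_first=True):
    """Return the double-word address of each beat of a four-beat burst.

    critical_first=True  models a cache-miss line fill (critical double word first).
    critical_first=False models all other bursts (start at the line boundary).
    Wraparound within the line is an assumption of this sketch.
    """
    line_base = addr & ~(LINE_BYTES - 1)
    start = (addr & ~(BEAT_BYTES - 1)) if critical_first else line_base
    offset = start - line_base
    beats = LINE_BYTES // BEAT_BYTES
    return [line_base + (offset + i * BEAT_BYTES) % LINE_BYTES for i in range(beats)]
```

Under these assumptions, a load miss at address 0x1014 would be filled in the order 0x1010, 0x1018, 0x1000, 0x1008, whereas a copy-back of the same line would start at 0x1000.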

The 603e does not directly support dynamic interfacing to subsystems with less than a 64- 
bit data path. It does, however, provide a static 32-bit data bus mode; for more information, 
see Section 8.1.2.1, “Optional 32-Bit Data Bus Mode.”

8.4.4 Data Transfer Termination 

Four signals are used to terminate data bus transactions: TA, DRTRY (data retry), TEA
(transfer error acknowledge), and ARTRY. The TA signal indicates normal termination of 
data transactions. It must always be asserted on the bus cycle coincident with the data that 
it is qualifying. It may be withheld by the slave for any number of clocks until valid data is 
ready to be supplied or accepted. DRTRY indicates invalid read data in the previous bus 
clock cycle. DRTRY extends the current data beat and does not terminate it. If it is asserted 
after the last (or only) data beat, the 603e negates DBB but still considers the data beat 
active and waits for another assertion of TA. DRTRY is ignored on write operations. TEA 
indicates a nonrecoverable bus error event. Upon receiving a final (or only) termination 
condition, the 603e always negates DBB for one cycle. 

If DRTRY is asserted by the memory system to extend the last (or only) data beat past the
negation of DBB, the memory system should three-state the data bus on the clock after the 
final assertion of TA, even though it will negate DRTRY on that clock. This is to prevent a 
potential momentary data bus conflict if a write access begins on the following cycle. 

The TEA signal is used to signal a nonrecoverable error during the data transaction. It may 
be asserted on any cycle during DBB, or on the cycle after a qualified TA during a read 
operation, except when no-DRTRY mode is selected (where no-DRTRY mode cancels 
checking the cycle after TA). The assertion of TEA terminates the data tenure immediately 
even if in the middle of a burst; however, it does not prevent incorrect data that has just been 
acknowledged with TA from being written into the 603e’s cache or GPRs. The assertion of
TEA initiates either a machine check exception or a checkstop condition based on the 
setting of the MSR. 

An assertion of ARTRY causes the data tenure to be terminated immediately if the ARTRY 
is for the address tenure associated with the data tenure in operation. If ARTRY is 
connected for the 603e, the earliest allowable assertion of TA to the 603e is directly 
dependent on the earliest possible assertion of ARTRY to the 603e; see Section 8.3.3, 
“Address Transfer Termination.”
If the 603e clock is configured for 1:1 or 1.5:1 (processor clock to bus clock ratio) mode
and the 603e is performing a burst read into its data cache, at least one wait state must be
provided between the assertion of TS and the first assertion of TA for that transaction. If
no-DRTRY mode is also selected, at least two wait states must be provided. The wait states
are required due to possible resource contention in the data cache caused by a block
replacement (or cast-out) required in connection with the new linefill. These wait states
may be provided by withholding the assertion of TA to the 603e for that data tenure, or by
withholding DBG to the 603e thereby delaying the start of the data tenure. This restriction
applies only to burst reads into the data cache when configured in 1:1 or 1.5:1 clock modes.
(It does not apply to instruction fetches, write operations, noncacheable read operations, or
clock modes other than 1:1 or 1.5:1.)
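
The wait-state restriction above reduces to a short decision rule. The following Python sketch restates it; the function name and argument names are hypothetical, and the result is the minimum number of wait states between the assertion of TS and the first assertion of TA.

```python
from fractions import Fraction

# The restriction applies only in 1:1 and 1.5:1 clock modes.
RESTRICTED_RATIOS = {Fraction(1), Fraction(3, 2)}


def required_wait_states(ratio, burst_read_into_dcache, no_drtry):
    """Minimum wait states between TS and the first TA for this transaction."""
    if not burst_read_into_dcache:
        return 0  # instruction fetches, writes, noncacheable reads
    if Fraction(ratio) not in RESTRICTED_RATIOS:
        return 0  # all other clock modes
    return 2 if no_drtry else 1
```

The wait states themselves may be inserted either by withholding TA or by withholding DBG, as described above; the sketch only computes how many are needed.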

8.4.4.1 Normal Single-Beat Termination 

Normal termination of a single-beat data read operation occurs when TA is asserted by a
responding slave. The TEA and DRTRY signals must remain negated during the transfer 
(see Figure 8-9). 







The DRTRY signal is not sampled during data writes, as shown in Figure 8-10. 
Figure 8-10. Normal Single-Beat Write Termination



Normal termination of a burst transfer occurs when TA is asserted for four bus clock cycles, 
as shown in Figure 8-11. The bus clock cycles in which TA is asserted need not be 
consecutive, thus allowing pacing of the data transfer beats. For read bursts to terminate 
successfully, TEA and DRTRY must remain negated during the transfer. For write bursts,
TEA must remain negated for a successful transfer. DRTRY is ignored during data writes. 




For read bursts, DRTRY may be asserted one bus clock cycle after TA is asserted to signal
that the data presented with TA is invalid and that the processor must wait for the negation
of DRTRY before forwarding data to the processor (see Figure 8-12). Thus, a data beat can
be terminated by a predicted branch with TA and then one bus clock cycle later confirmed
with the negation of DRTRY. The DRTRY signal is valid only for read transactions. TA 
must be asserted on the bus clock cycle before the first bus clock cycle of the assertion of 
DRTRY; otherwise the results are undefined. 

The DRTRY signal extends data bus mastership such that other processors cannot use the
data bus until DRTRY is negated. Therefore, in the example in Figure 8-12, DBB cannot 
be asserted until bus clock cycle 5. This is true for both read and write operations even 
though DRTRY does not extend bus mastership for write operations. 










Figure 8-13 shows the effect of using DRTRY during a burst read. It also shows the effect
of using TA to pace the data transfer rate. Notice that in bus clock cycle 3 of Figure 8-13, 
TA is negated for the second data beat. The 603e data pipeline does not proceed until bus 
clock cycle 4 when the TA is reasserted. 

Note that DRTRY is useful for systems that implement predicted forwarding of data such 
as those with direct-mapped, second-level caches where hit/miss is determined on the 
following bus clock cycle, or for parity- or ECC-checked memory systems. 

Note that DRTRY may not be implemented on other PowerPC processors. 
8.4.4.2 Data Transfer Termination Due to a Bus Error

The TEA signal indicates that a bus error occurred. It may be asserted while DBB (and/or 
DRTRY for read operations) is asserted. Asserting TEA to the 603e terminates the 
transaction; that is, further assertions of TA and DRTRY are ignored and DBB is negated; 
see Figure 8-13. 







Assertion of the TEA signal causes a machine check exception (and possibly a check-stop 
condition within the 603e). For more information, see Section 4.5.2, “Machine Check 
Exception (0x00200).” Note also that the 603e does not implement a synchronous error 
capability for memory accesses. This means that the exception instruction pointer does not 
point to the memory operation that caused the assertion of TEA, but to the instruction about 
to be executed (perhaps several instructions later). However, assertion of TEA does not 
invalidate data entering the GPR or the cache. Additionally, the corresponding address of
the access that caused TEA to be asserted is not latched by the 603e. To recover, the 
exception handler must determine and remedy the cause of the TEA, or the 603e must be 
reset; therefore, this function should only be used to flag fatal system conditions to the 
processor (such as parity or uncorrectable ECC errors). 

After the 603e has committed to run a transaction, that transaction must eventually 
complete. Address retry causes the transaction to be restarted; TA wait states and DRTRY 
assertion for reads delay termination of individual data beats. Eventually, however, the 
system must either terminate the transaction or assert the TEA signal (and vector the 603e
into a machine check exception). For this reason, care must be taken to check for the end
of physical memory and the location of certain system facilities to avoid memory accesses 
that result in the generation of machine check exceptions. 
Note that TEA generates a machine check exception depending on the ME bit in the MSR.
Clearing the machine check exception enable control bits leads to a true checkstop 
condition (instruction execution halted and processor clock stopped). 



8.4.5 Memory Coherency— MEI Protocol 

The 603e provides dedicated hardware to provide memory coherency by snooping bus 
transactions. The address retry capability enforces the three-state, MEI cache-coherency 
protocol (see Figure 8-14). 

The global (GBL) output signal indicates whether the current transaction must be snooped 
by other snooping devices on the bus. Address bus masters assert GBL to indicate that the 
current transaction is a global access (that is, an access to memory shared by more than one 
device). If GBL is not asserted for the transaction, that transaction is not snooped. When 
other devices detect the GBL input asserted, they must respond by snooping the broadcast 
address. 

Normally, GBL reflects the M bit value specified for the memory reference in the 
corresponding translation descriptor(s). Note that care must be taken to minimize the 
number of pages marked as global, because the retry protocol discussed in the previous 
section is used to enforce coherency and can require significant bus bandwidth. 

When the 603e is not the address bus master, GBL is an input. The 603e snoops a 
transaction if TS and GBL are asserted together in the same bus clock cycle (this is a 
qualified snooping condition). No snoop update to the 603e cache occurs if the snooped 
transaction is not marked global. This includes invalidation cycles. 

When the 603e detects a qualified snoop condition, the address associated with the TS is 
compared against the data cache tags. Snooping completes if no hit is detected. If, however, 
the address hits in the cache, the 603e reacts according to the MEI protocol shown in 
Figure 8-14, assuming the WIM bits are set to write-back, caching-allowed, and coherency- 
enforced modes (WIM = 001). 

The 603e's on-chip data cache is implemented as a four-way set-associative cache. To 
facilitate external monitoring of the internal cache tags, the cache set entry (CSE0-CSE1)
signals indicate which cache set is being replaced on read operations. Note that these
signals are valid only for 603e burst operations; for all other bus operations, the CSE0-
CSE1 signals should be ignored.
(State diagram not reproduced; its legend follows.)

SH = Snoop Hit
RH = Read Hit
WH = Write Hit
WM = Write Miss
RM = Read Miss

SH/CRW = Snoop Hit, Cacheable Read/Write
SH/CIR = Snoop Hit, Cache-Inhibited Read

Bus transactions marked in the diagram: snoop push and cache line fill.


Figure 8-14. MEI Cache Coherency Protocol— State Diagram (WIM = 001) 

Table 8-9 shows the CSE encodings. 

Table 8-9. CSE0-CSE1 Signals

CSE0-CSE1   Cache Set Element
00          Set 0
01          Set 1
10          Set 2
11          Set 3
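
Table 8-9 is a direct binary encoding of the set number, with CSE0 as the more significant bit. External tag-monitoring logic could decode it as in the following sketch; the function name is hypothetical.

```python
def cache_set_from_cse(cse0, cse1):
    """Decode CSE0-CSE1 (each 0 or 1) into the cache set number of Table 8-9."""
    if cse0 not in (0, 1) or cse1 not in (0, 1):
        raise ValueError("CSE0 and CSE1 must each be 0 or 1")
    return (cse0 << 1) | cse1
```

The result is meaningful only for 603e burst operations; for all other bus operations the CSE signals should be ignored, as stated above.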
8.5 Timing Examples

This section shows timing diagrams for various scenarios. Figure 8-15 illustrates the fastest 
single-beat reads possible for the 603e. This figure shows both minimal latency and 
maximum single-beat throughput. By delaying the data bus tenure, the latency increases, 
but, because of split-transaction pipelining, the overall throughput is not affected unless the 
data bus latency causes the third address tenure to be delayed. 

Note that all bidirectional signals are three-stated between bus tenures. 

Figure 8-15. Fastest Single-Beat Reads
(Timing diagram not reproduced. Signals shown over bus clock cycles 1 through 12: BR, BG,
ABB, TS, A0-A31, TT0-TT4, TBST, GBL, AACK, ARTRY, DBG, DBB, D0-D63, TA, DRTRY,
and TEA.)

Figure 8-16 illustrates the fastest single-beat writes supported by the 603e. All bidirectional 
signals are three-stated between bus tenures. 
Figure 8-17 shows three ways to delay single-beat reads using data-delay controls:

• The TA signal can remain negated to insert wait states in clock cycles 3 and 4.

• For the second access, DBG could have been asserted in clock cycle 6.

• In the third access, DRTRY is asserted in clock cycle 11 to flush the previous data.

Note that all bidirectional signals are three-stated between bus tenures. The pipelining 
shown in Figure 8-17 can occur if the second access is not another load (for example, an 
instruction fetch). 




Figure 8-17. Single-Beat Reads Showing Data-Delay Controls
Figure 8-18 shows data-delay controls in a single-beat write operation. Note that all
bidirectional signals are three-stated between bus tenures. Data transfers are delayed in the 
following ways: 

• The TA signal is held negated to insert wait states in clocks 3 and 4. 

• In clock 6, DBG is held negated, delaying the start of the data tenure. 

The last access is not delayed (DRTRY is valid only for read operations). 
Figure 8-18. Single-Beat Writes Showing Data-Delay Controls
Figure 8-19 shows the use of data-delay controls with burst transfers. Note that all
bidirectional signals are three-stated between bus tenures. Note the following: 

• The first data beat of bursted read data (clock 0) is the critical quad word. 

• The write burst shows the use of TA signal negation to delay the third data beat. 

• The final read burst shows the use of DRTRY on the third data beat. 

• The address for the third transfer is delayed until the first transfer completes. 
Figure 8-19. Burst Transfers with Data-Delay Controls
Figure 8-20 shows the use of the TEA signal. Note that all bidirectional signals are three-
stated between bus tenures. Note the following: 

• The first data beat of the read burst (in clock 0) is the critical quad word. 

• The TEA signal truncates the burst write transfer on the third data beat. 

• The 603e eventually causes an exception to be taken on the TEA event. 
Figure 8-20. Use of Transfer Error Acknowledge (TEA)
8.6 Optional Bus Configurations

The 603e supports three optional bus configurations that are selected by the assertion or 
negation of DRTRY, TLBISYNC, and QACK signals during the negation of the HRESET 
signal. The operation and selection of the optional bus configurations are described in the 
following sections. 

8.6.1 32-Bit Data Bus Mode 

The 603e supports an optional 32-bit data bus mode. The 32-bit data bus mode operates the 
same as the 64-bit data bus mode with the exception of the byte lanes involved in the 
transfer and the number of data beats that are performed. When in 32-bit data bus mode, 
only byte lanes 0 through 3 are used corresponding to DH0-DH31 and DP0-DP3. Byte
lanes 4 through 7 corresponding to DL0-DL31 and DP4-DP7 are never used in this mode. 
The unused data bus signals are not sampled by the 603e during read operations, and they 
are driven low during write operations. 

The number of data beats required for a data tenure in the 32-bit data bus mode is one, two, 
or eight beats depending on the size of the program transaction and the cache mode for the 
address. Data transactions of one or two data beats are performed for caching-inhibited 
load/store or write-through store operations. These transactions do not assert the TBST 
signal even though a two-beat burst may be performed (having the same TBST and 
TSIZ0-TSIZ2 encodings as the 64-bit data bus mode). Single-beat data transactions are 
performed for bus operations of 4 bytes or less, and double-beat data transactions are 
performed for 8-byte operations only. The 603e only generates an 8-byte operation for a 
double-word-aligned load or store double operation to or from the floating-point GPRs. All
cache-inhibited instruction fetches are performed as word (single-beat) operations. 

Data transactions of eight data beats are performed for burst operations that load into or 
store from the 603e’s internal caches. These transactions transfer 32 bytes in the same way
as in 64-bit data bus mode, asserting the TBST signal, and signaling a transfer size of 2
(TSIZ0-TSIZ2 = 0b010).
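
The beat counts for the two data bus modes can be summarized in one function. The sketch below is illustrative; the name data_beats and its arguments are hypothetical, and transfer sizes for non-burst operations are taken to be 1 to 8 bytes.

```python
def data_beats(bus_width_bits, size_bytes, burst):
    """Number of data beats in a data tenure.

    bus_width_bits: 64 or 32 (the static data bus mode).
    size_bytes:     bytes transferred by a non-burst operation (1 to 8);
                    ignored for bursts, which always move 32 bytes.
    burst:          True for cache line fills and copy-backs.
    """
    if bus_width_bits not in (32, 64):
        raise ValueError("data bus is either 64 or 32 bits wide")
    if burst:
        return 4 if bus_width_bits == 64 else 8
    if not 1 <= size_bytes <= 8:
        raise ValueError("non-burst operations transfer 1 to 8 bytes")
    if bus_width_bits == 32 and size_bytes == 8:
        return 2  # double-word operations need two beats on a 32-bit bus
    return 1
```

Note that the two-beat case is not signaled as a burst: the TBST and TSIZ0-TSIZ2 encodings match those used in 64-bit data bus mode.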

The same bus protocols apply for arbitration, transfer, and termination of the address and 
data tenures in the 32-bit data bus mode as they apply to the 64-bit data bus mode. Late
ARTRY cancellation of the data tenure applies on the bus clock after the first data beat is
acknowledged (after the first TA) for word or smaller transactions, or on the bus clock after
the second data beat is acknowledged (after the second TA) for double-word or burst
operations (or coincident with respective TA if no-DRTRY mode is selected). 

An example of an eight-beat data transfer while the 603e is in 32-bit data bus mode is 
shown in Figure 8-21. 
Figure 8-21. 32-Bit Data Bus Transfer (Eight-Beat Burst)



An example of a two-beat data transfer (with DRTRY asserted during each data tenure) is 
shown in Figure 8-22. 




The 603e selects 64-bit or 32-bit data bus mode at startup by sampling the state of the
TLBISYNC signal at the negation of HRESET. If the TLBISYNC signal is negated at the
negation of HRESET, 64-bit data mode is entered by the 603e. If TLBISYNC is asserted at
the negation of HRESET, 32-bit data mode is entered.
8.6.2 No-DRTRY Mode

The 603e supports an optional mode to disable the use of the data retry function provided 
through the DRTRY signal. The no-DRTRY mode allows the forwarding of data during 
load operations to the internal CPU one bus cycle sooner than in the normal bus protocol. 

The PowerPC bus protocol specifies that, during load operations, the memory system 
normally has the capability to cancel data that was read by the master on the bus cycle after 
TA was asserted. In the 603e implementation, this late cancellation protocol requires the 
603e to hold any loaded data at the bus interface for one additional bus clock to verify that 
the data is valid before forwarding it to the internal CPU. For systems that do not implement 
the DRTRY function, the 603e provides an optional no-DRTRY mode that eliminates this 
one-cycle stall during all load operations, and allows for the forwarding of data to the 
internal CPU immediately when TA is recognized. 

When the 603e is in the no-DRTRY mode, data can no longer be cancelled the cycle after 
it is acknowledged by an assertion of TA. Data is immediately forwarded to the CPU 
internally, and any attempt at late cancellation by the system may cause improper operation 
by the 603e. 

When the 603e is following normal bus protocol, data may be cancelled the bus cycle after
TA by either of two means: late cancellation by DRTRY, or late cancellation by ARTRY.
When no-DRTRY mode is selected, both cancellation cases must be disallowed in the 
system design for the bus protocol. 

When no-DRTRY mode is selected for the 603e, the system must ensure that DRTRY will 
not be asserted to the 603e. If it is asserted, it may cause improper operation of the bus 
interface. The system must also ensure that an assertion of ARTRY by a snooping device 
must occur before or coincident with the first assertion of TA to the 603e, but not on the 
cycle after the first assertion of TA. 
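
The ARTRY timing constraint can be checked mechanically in a bus model. The sketch below covers the 64-bit data bus case only and compares bus clock numbers; the function name is hypothetical, and the normal-protocol limit (the clock after the first TA) is taken from the late-cancellation description earlier in this chapter.

```python
def artry_timing_ok(artry_clock, first_ta_clock, no_drtry):
    """Check that ARTRY arrives early enough relative to the first TA.

    artry_clock:    bus clock in which a snooping device asserts ARTRY.
    first_ta_clock: bus clock of the first (or only) assertion of TA.
    no_drtry:       True if no-DRTRY mode was selected at reset.
    """
    # Normal protocol allows late cancellation on the clock after TA;
    # no-DRTRY mode requires ARTRY before or coincident with TA.
    latest = first_ta_clock if no_drtry else first_ta_clock + 1
    return artry_clock <= latest
```

A system simulation could assert this condition for every retried address tenure to catch designs that rely on late cancellation while no-DRTRY mode is selected.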

Other than the inability to cancel data that was read by the master on the bus cycle after TA 
was asserted, the bus protocol for the 603e is identical to that for the basic transfer bus 
protocols described in this chapter, as well as for 32-bit data bus mode. 

The 603e selects the desired DRTRY mode at startup by sampling the state of the DRTRY 
signal itself at the negation of the HRESET signal. If the DRTRY signal is negated at the
negation of HRESET, normal operation is selected. If the DRTRY signal is asserted at the
negation of HRESET, no-DRTRY mode is selected. 

8.6.3 Reduced-Pinout Mode 

The 603e provides an optional reduced-pinout mode. This mode idles the switching of 
numerous signals for reduced power consumption. The DL0-DL31, DP0-DP7, AP0-AP3,
APE, DPE, and RSRV signals are disabled when the reduced-pinout mode is selected. Note 
that the 32-bit data bus mode is implicitly selected when the reduced-pinout mode is 
enabled. 
When in the reduced-pinout mode, the bidirectional and output signals disabled are always
driven low during the periods when they would normally have been driven by the 603e. The 
open-drain outputs (APE and DPE) are always three-stated. The bidirectional inputs are 
always turned-off at the input receivers of the 603e and are not sampled. 

The 603e selects either full-pinout or reduced-pinout mode at startup by sampling the state
of the QACK signal at the negation of HRESET. If the QACK signal is asserted at the
negation of HRESET, full-pinout mode is selected by the 603e. If QACK is negated at the
negation of HRESET, reduced-pinout mode is selected.
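The two reset-time selections described in Sections 8.6.2 and 8.6.3 can be summarized in a small model. The following is only an illustrative sketch: the names `ResetConfig` and `decode_reset_config` are hypothetical, and the inputs are expressed as "asserted" or "negated" states rather than electrical levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResetConfig:
    """Modes latched when HRESET is negated (hypothetical helper type)."""
    no_drtry_mode: bool       # True when no-DRTRY mode is selected
    reduced_pinout: bool      # True when reduced-pinout mode is selected
    implies_32bit_bus: bool   # reduced-pinout mode implicitly selects 32-bit data bus mode


def decode_reset_config(drtry_asserted: bool, qack_asserted: bool) -> ResetConfig:
    """Decode the states of DRTRY and QACK sampled at the negation of HRESET."""
    reduced = not qack_asserted          # QACK negated selects reduced-pinout mode
    return ResetConfig(
        no_drtry_mode=drtry_asserted,    # DRTRY asserted selects no-DRTRY mode
        reduced_pinout=reduced,
        implies_32bit_bus=reduced,
    )
```

For example, `decode_reset_config(drtry_asserted=False, qack_asserted=True)` describes a system that uses the normal bus protocol with the full pinout.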

8.7 Interrupt, Checkstop, and Reset Signals 

This section describes external interrupts, checkstop operations, and hard and soft reset 
inputs. 

8.7.1 External Interrupts

The external interrupt input signals (INT, SMI, and MCP) of the 603e eventually force the
processor to take the external interrupt vector or the system management interrupt vector
if MSR[EE] is set, or the machine check interrupt vector if the MSR[ME] bit and the
HID0[EMCP] bit are set.

8.7.2 Checkstops 

The 603e has two checkstop input signals, CKSTP_IN (non-maskable) and MCP (enabled
when MSR[ME] is cleared and HID0[EMCP] is set), and a checkstop output
(CKSTP_OUT). If CKSTP_IN or MCP is asserted, the 603e halts operations by gating off
all internal clocks. The 603e asserts CKSTP_OUT if CKSTP_IN is asserted.

If CHECKSTOP is asserted by the 603e, it has entered the checkstop state, and processing
has halted internally. The CHECKSTOP signal can be asserted for various reasons,
including receiving a TEA signal and detection of external parity errors. For more
information about checkstop state, see Section 4.5.2.2, “Checkstop State (MSR[ME] = 0).”
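The way the MCP input is steered by MSR[ME] and HID0[EMCP] in Sections 8.7.1 and 8.7.2 can be written as a small decision function. This is a sketch only: the function name `mcp_response` is hypothetical, and treating MCP as masked when HID0[EMCP] is cleared is an assumption, since that case is not spelled out in this section.

```python
def mcp_response(msr_me: bool, hid0_emcp: bool) -> str:
    """Return how an assertion of MCP is handled for the given control bits."""
    if not hid0_emcp:
        # Assumption: with HID0[EMCP] cleared, the MCP input is not enabled.
        return "masked"
    if msr_me:
        # MSR[ME] and HID0[EMCP] both set: machine check interrupt is taken.
        return "machine check interrupt"
    # MSR[ME] cleared and HID0[EMCP] set: MCP acts as a checkstop input.
    return "checkstop"
```

CKSTP_IN needs no such function because it is non-maskable and always leads to the checkstop state.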

8.7.3 Reset Inputs 

The 603e has two reset inputs, described as follows: 

• HRESET (hard reset): The HRESET signal is used for power-on reset sequences,
or for situations in which the 603e must go through the entire cold-start sequence of
internal hardware initializations.

• SRESET (soft reset): The soft reset input provides warm reset capability. This
input can be used to avoid forcing the 603e to complete the cold-start sequence.

When either reset input is negated, the processor attempts to fetch code from the system
reset exception vector. The vector is located at offset 0x00100 from the exception prefix (all
zeros or ones, depending on the setting of the exception prefix bit in the machine state
register, MSR[IP]). The IP bit is set for HRESET.
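The resulting fetch address can be worked out with a few lines of arithmetic. The sketch below assumes the usual 32-bit PowerPC convention that the exception prefix occupies the upper 12 address bits (0x000 or 0xFFF); the function names are hypothetical.

```python
SYSTEM_RESET_OFFSET = 0x00100  # offset of the system reset exception vector


def exception_prefix(msr_ip: bool) -> int:
    """Upper address bits are all ones when MSR[IP] is set, all zeros otherwise."""
    return 0xFFF00000 if msr_ip else 0x00000000


def system_reset_vector(msr_ip: bool) -> int:
    """Physical address from which code is fetched after a reset input is negated."""
    return exception_prefix(msr_ip) | SYSTEM_RESET_OFFSET
```

Because the IP bit is set for HRESET, a hard reset under these assumptions begins fetching at `system_reset_vector(True)`, that is, 0xFFF00100.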

8.7.4 System Quiesce Control Signals

The system quiesce control signals (QREQ and QACK) allow the processor to enter a low 
power state, and bring bus activity to a quiescent state in an orderly fashion. 

The 603e requests entry into the system quiesce state by asserting the QREQ signal. This
signal allows the system to terminate or pause any bus activities that are normally snooped.
When the system is ready to enter the system quiesce state, it asserts the QACK signal. At
this time the 603e may enter a quiescent (low power) state. When the 603e is in the
quiescent state, it stops snooping bus activity.

8.8 Processor State Signals 

This section describes the 603e's support for atomic memory updates through the use of
the lwarx/stwcx. opcode pair, and includes a description of the 603e TLBISYNC input.

8.8.1 Support for the lwarx/stwcx. Instruction Pair

The Load Word and Reserve Indexed (lwarx) and the Store Word Conditional Indexed
(stwcx.) instructions provide a means for atomic memory updating. Memory can be
updated atomically by setting a reservation on the load and checking that the reservation is
still valid before the store is performed. In the 603e, the reservations are made on behalf of
aligned, 32-byte sections of the memory address space.
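The reservation granularity can be illustrated with a toy model. This is a simplified sketch, not a description of the 603e's internal logic: the class `ReservationModel` and its methods are hypothetical, and the model deliberately ignores any address comparison on the conditional store.

```python
RESERVATION_GRANULE = 32  # bytes; reservations cover aligned 32-byte sections


def reservation_granule(addr: int) -> int:
    """Return the base address of the aligned 32-byte section containing addr."""
    return addr & ~(RESERVATION_GRANULE - 1) & 0xFFFFFFFF


class ReservationModel:
    """Toy model of a single reservation held on behalf of one granule."""

    def __init__(self) -> None:
        self.granule = None  # None means no reservation is held

    def lwarx(self, addr: int) -> None:
        # The load sets a reservation for the granule containing addr.
        self.granule = reservation_granule(addr)

    def snoop_store(self, addr: int) -> None:
        # A store by another master to the reserved granule cancels the reservation.
        if self.granule == reservation_granule(addr):
            self.granule = None

    def stwcx(self, addr: int) -> bool:
        # The store is performed only if the reservation is still valid.
        # Simplification: addr is not compared against the reserved granule.
        ok = self.granule is not None
        self.granule = None  # the reservation is consumed either way
        return ok
```

In this model a snooped store to address 0x101C cancels a reservation set by `lwarx(0x1004)`, because both addresses fall in the section starting at 0x1000, whereas a store to 0x1020 does not.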

The reservation (RSRV) output signal is driven synchronously with the bus clock and 
reflects the status of the reservation coherency bit in the reservation address register (see 
Chapter 3, “Instruction and Data Cache Operation,” for more information). See
Section 7.2.9.7.3, “Reservation (RSRV)— Output,” for information about timing. 



8.8.2 TLBISYNC Input 

The TLBISYNC input allows for the hardware synchronization of changes to MMU tables 
when the 603e and another DMA master share the same MMU translation tables in system 
memory. It is asserted by a DMA master when it is using shared addresses that could be 
changed in the MMU tables by the 603e during the DMA master’s tenure. 

The TLBISYNC input, when asserted to the 603e, prevents the 603e from completing any
instructions past a tlbsync instruction. Generally, during the execution of an eciwx or
ecowx instruction by the 603e, the selected DMA device should assert the 603e’s
TLBISYNC signal and hold it asserted during its DMA tenure if it is using a shared
translation address. Subsequent instructions by the 603e should include a sync and tlbsync
instruction before any MMU table changes are performed. This prevents the 603e from
making table changes disruptive to the other master during the DMA period.

8.9 IEEE 1149.1-Compliant Interface

The 603e boundary-scan interface is a fully-compliant implementation of the IEEE 1149.1 
standard. This section describes the 603e IEEE 1149.1 (JTAG) interface.

8.9.1 IEEE 1149.1 Interface Description 

The 603e has five dedicated JTAG signals which are described in Table 8-10. The TDI and 
TDO scan ports are used to scan instructions as well as data into the various scan registers 
for JTAG operations. The scan operation is controlled by the test access port (TAP) 
controller which in turn is controlled by the TMS input sequence. The scan data is latched 
in at the rising edge of TCK. 



Table 8-10. IEEE Interface Pin Descriptions

Signal Name   Input/Output   Weak Pullup Provided   IEEE 1149.1 Function
-----------   ------------   --------------------   --------------------------
TDI           Input          Yes                    Serial scan input signal
TDO           Output         No                     Serial scan output signal
TMS           Input          Yes                    TAP controller mode signal
TCK           Input          Yes                    Scan clock
TRST          Input          Yes                    TAP controller reset



TRST is an optional JTAG signal that is used to reset the TAP controller asynchronously.
The TRST signal ensures that the JTAG logic does not interfere with the normal operation
of the chip, and can be asserted coincident with HRESET.

8.10 Using Data Bus Write Only 

The 603e supports split-transaction pipelined transactions. It supports a limited
out-of-order capability for its own pipelined transactions through the data bus write only
(DBWO) signal. When recognized on the clock of a qualified DBG, the assertion of DBWO
directs the 603e to perform the next pending data write tenure (if any), even if a pending
read tenure would have normally been performed because of address pipelining. The DBWO
signal does not change the order of write tenures with respect to other write tenures from
the same 603e. It only allows that a write tenure be performed ahead of a pending read
tenure from the same 603e.

In general, an address tenure on the bus is followed strictly in order by its associated data 
tenure. Transactions pipelined by the 603e complete strictly in order. However, the 603e 
can run bus transactions out of order only when the external system allows the 603e to 
perform a cache-line-snoop-push-out operation (or other write transaction, if pending in the 
603e write queues) between the address and data tenures of a read operation through the 
use of DBWO. This effectively envelops the write operation within the read operation.

Figure 8-23 shows how the DBWO signal is used to perform an enveloped write
transaction.

[Timing diagram not reproduced; legible signal labels include DBB and DBWO.]

Figure 8-23. Data Bus Write Only Transaction

Note that although the 603e can pipeline any write transaction behind the read transaction,
special care should be used when using the enveloped write feature. It is envisioned that
most system implementations will not need this capability; for these applications, DBWO
should remain negated. In systems where this capability is needed, DBWO should be
asserted under the following scenario:

1. The 603e initiates a read transaction (either single-beat or burst) by completing the
read address tenure with no address retry.

2. Then, the 603e initiates a write transaction by completing the write address tenure,
with no address retry.

3. At this point, if DBWO is asserted with a qualified data bus grant to the 603e, the
603e asserts DBB and drives the write data onto the data bus, out of order with
respect to the address pipeline. The write transaction concludes with the 603e
negating DBB.

4. The next qualified data bus grant signals the 603e to complete the outstanding read
transaction by latching the data on the bus. This assertion of DBG should not be
accompanied by an asserted DBWO.

Any number of bus transactions by other bus masters can be attempted between any of
these steps.

Note the following regarding DBWO:



• DBWO can be asserted if no data bus read is pending, but it has no effect on write 
ordering. 

• The ordering and presence of data bus writes is determined by the writes in the write
queues at the time BG is asserted for the write address (not DBG). If a particular
write is desired (for example, a cache-line-snoop-push-out operation), then BG must
be asserted after that particular write is in the queue and it must be the highest
priority write in the queue at that time. A cache-line-snoop-push-out operation may
be the highest priority write, but more than one may be queued.

• Because more than one write may be in the write queue when DBG is asserted for 
the write address, more than one data bus write may be enveloped by a pending data 
bus read. 

The arbiter must monitor bus operations and coordinate the various masters and slaves with 
respect to the use of the data bus when DBWO is used. Individual DBG signals associated 
with each bus device should allow the arbiter to synchronize both pipelined and split- 
transaction bus organizations. Individual DBG and DBWO signals provide a primitive 
form of source-level tagging for the granting of the data bus. 

Note that use of the DBWO signal allows some operation-level tagging with respect to the 
603e and the use of the data bus. 
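The four-step enveloped-write scenario in this section can be modeled as a choice of which pending data tenure runs on each qualified data bus grant. The sketch below is a simplified software model, not the 603e's arbitration logic: the function `next_data_tenure` and the tuple representation are hypothetical, and falling back to normal order when DBWO is asserted with no write pending is an assumption.

```python
from collections import deque


def next_data_tenure(pending: deque, dbwo: bool):
    """Pick the data tenure performed on a qualified DBG.

    pending holds (kind, tag) tuples in address-tenure order, where kind is
    "read" or "write". Returns the tuple that is performed, or None if the
    queue is empty.
    """
    if not pending:
        return None
    if dbwo:
        # DBWO: perform the next pending write, even if a read is ahead of it.
        # Writes keep their order relative to other writes.
        for i, (kind, tag) in enumerate(pending):
            if kind == "write":
                del pending[i]
                return (kind, tag)
        # Assumption: with no write pending, DBWO has no effect.
    # Without DBWO, data tenures follow their address tenures strictly in order.
    return pending.popleft()
```

With a read R1 followed by a write W1 in the queue, a grant with DBWO asserted performs W1 first, and the next grant without DBWO completes R1, which matches steps 3 and 4 of the scenario.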

Chapter 9

Power Management 

The PowerPC 603e microprocessor is the first microprocessor specifically designed for 
low-power operation. The 603e provides both automatic and program-controllable power 
reduction modes for progressive reduction of power consumption. This chapter describes 
the hardware support provided by the 603e for power management. 

9.1 Dynamic Power Management 

Dynamic power management automatically powers up and down the individual execution 
units of the 603e, based upon the contents of the instruction stream. For example, if no 
floating-point instructions are being executed, the floating-point unit is automatically 
powered down. Power is not actually removed from the execution unit; instead, each 
execution unit has an independent clock input, which is automatically controlled on a 
clock-by-clock basis. Since CMOS circuits consume negligible power when they are not 
switching, stopping the clock to an execution unit effectively eliminates its power 
consumption. The operation of DPM is completely transparent to software or any external 
hardware. Dynamic power management is enabled by setting bit 11 in HID0 on power-up,
or following HRESET.

9.2 Programmable Power Modes 

The 603e provides four programmable power states: full power, doze, nap, and sleep.
Software selects these modes by setting one (and only one) of the three power saving mode
bits. Hardware can enable a power management state through external asynchronous
interrupts. The hardware interrupt causes the transfer of program flow to interrupt handler
code. The appropriate mode is then set by the software. The 603e provides a separate
interrupt and interrupt vector for power management, the system management interrupt
(SMI). The 603e also contains a decrementer timer, which allows it to enter the nap or doze
mode for a predetermined amount of time and then return to full-power operation through
the decrementer interrupt (DI). Note that the 603e cannot switch from one power
management mode to another without first returning to full-power mode. The nap and sleep
modes disable bus snooping; therefore, a hardware handshake is provided to ensure
coherency before the 603e enters these power management modes. Table 9-1 summarizes
the four power states.

Table 9-1. PowerPC 603e Microprocessor Programmable Power Modes

PM Mode                Functioning Units       Activation Method        Full-Power Wake Up Method
---------------------  ----------------------  -----------------------  ---------------------------------
Full power             All units active        -                        -

Full power (with DPM)  Requested logic by      By instruction dispatch  -
                       demand

Doze                   Bus snooping            Controlled by software   External asynchronous exceptions*
                       Data cache as needed                             Decrementer interrupt
                       Decrementer timer                                Reset

Nap                    Decrementer timer       Controlled by hardware   External asynchronous exceptions
                                               and software             Decrementer interrupt
                                                                        Reset

Sleep                  None                    Controlled by hardware   External asynchronous exceptions
                                               and software             Reset

* Exceptions are referred to as interrupts in the architecture specification.
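The wake-up column of Table 9-1 can be captured as a lookup table, which is handy when checking that a system design provides a valid wake-up source for each mode it uses. The names below (`WAKE_SOURCES`, `wakes`, `needs_quiesce_handshake`) are hypothetical and only restate what the table and the preceding paragraph say.

```python
# Events that return the 603e to full power from each programmable mode.
WAKE_SOURCES = {
    "doze": frozenset({"external", "decrementer", "reset"}),
    "nap": frozenset({"external", "decrementer", "reset"}),
    "sleep": frozenset({"external", "reset"}),  # decrementer is disabled in sleep
}

# Whether bus snooping remains active in each mode.
SNOOPS_BUS = {"doze": True, "nap": False, "sleep": False}


def wakes(mode: str, event: str) -> bool:
    """Return True if the event returns the processor to full power from mode."""
    return event in WAKE_SOURCES[mode]


def needs_quiesce_handshake(mode: str) -> bool:
    """Modes that disable snooping require the hardware handshake before entry."""
    return not SNOOPS_BUS[mode]
```

For instance, `wakes("sleep", "decrementer")` is false, so a design that relies on a timed wake-up would use doze or nap mode instead.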



9.2.1 Power Management Modes 

The following sections describe the characteristics of the 603e’s power management 
modes, the requirements for entering and exiting the various modes, and the system 
capabilities provided by the 603e while the power management modes are active. 

9.2.1.1 Full-Power Mode with DPM Disabled

Full-power mode with DPM disabled is selected when the DPM enable bit (bit 11) in HID0
is cleared.

• Default state following power-up and HRESET 

• All functional units are operating at full processor speed at all times 

9.2.1.2 Full-Power Mode with DPM Enabled

Full-power mode with DPM enabled (HID0[11] = 1) provides on-chip power management 
without affecting the functionality or performance of the 603e. 

• Required functional units are operating at full processor speed 

• Functional units are clocked only when needed 

• No software or hardware intervention required after mode is set 

• Software/hardware and performance transparent

9.2.1.3 Doze Mode

Doze mode disables most functional units but maintains cache coherency by enabling the 
bus interface unit and snooping. A snoop hit will cause the 603e to enable the data cache, 
copy the data back to memory, disable the cache, and fully return to the doze state. 

• Most functional units disabled 

• Bus snooping and time base/decrementer still enabled 

• Doze mode sequence 

— Set doze bit (HID0[8] = 1) 

— 603e enters doze mode after several processor clocks 

• Several methods of returning to full-power mode 
— Assert INT, SMI, MCP or decrementer interrupts 
— Assert hard reset or soft reset 

• Transition to full-power state takes no more than a few processor cycles 

• PLL running and locked to SYSCLK 

9.2.1.4 Nap Mode

The nap mode disables the 603e but still maintains the phase-locked loop (PLL) and the
time base/decrementer. The time base can be used to restore the 603e to full-on state after
a programmed amount of time. Because bus snooping is disabled for nap and sleep modes,
a hardware handshake using the quiesce request (QREQ) and quiesce acknowledge
(QACK) signals is required to maintain data coherency. The 603e asserts the QREQ
signal to indicate that it is ready to disable bus snooping. When the system has ensured that
snooping is no longer necessary, it asserts QACK and the 603e enters the sleep or
nap mode.

• Time base/decrementer still enabled 

• Most functional units disabled (including bus snooping) 

• All nonessential input receivers disabled 

• Nap mode sequence 

— Set nap bit (HID0[9] = 1) 

— 603e asserts quiesce request (QREQ) signal 

— System asserts quiesce acknowledge (QACK) signal 

— 603e enters nap mode after several processor clocks

• Several methods of returning to full-power mode 
— Assert INT, SMI, MCP or decrementer interrupts 
— Assert hard reset or soft reset 

• Transition to full-power takes no more than a few processor cycles 

• PLL running and locked to SYSCLK

9.2.1.5 Sleep Mode

Sleep mode consumes the least amount of power of the four modes since all functional units 
are disabled. To conserve the maximum amount of power, the PLL may be disabled and the 
SYSCLK may be removed. Due to the fully static design of the 603e, internal processor 
state is preserved when no internal clock is present. Because the time base and decrementer 
are disabled while the 603e is in sleep mode, the 603e’s time base contents will have to be 
updated from an external time base following sleep mode if accurate time-of-day
maintenance is required. Before the 603e enters the sleep mode, the 603e will assert the
QREQ signal to indicate that it is ready to disable bus snooping. When the system has 
ensured that snooping is no longer necessary, it will assert QACK and the 603e will enter 
the sleep mode. 

• All functional units disabled (including bus snooping and time base) 

• All nonessential input receivers disabled 
— Internal clock regenerators disabled 
— PLL still running (see below) 

• Sleep mode sequence 

— Set sleep bit (HID0[10] = 1) 

— 603e asserts quiesce request (QREQ) 

— System asserts quiesce acknowledge (QACK) 

— 603e enters sleep mode after several processor clocks 

• Several methods of returning to full-power mode
— Assert INT, SMI, or MCP interrupts

— Assert hard reset or soft reset 

• PLL may be disabled and SYSCLK may be removed while in sleep mode 

• Return to full-power mode after PLL and SYSCLK disabled in sleep mode 
— Enable SYSCLK 

— Reconfigure PLL into desired processor clock mode 
— System logic waits for PLL startup and relock time (100 µs)

— System logic asserts one of the sleep recovery signals (for example, INT or SMI) 

9.2.2 Power Management Software Considerations 

Since the 603e is a dual-issue processor with out-of-order execution capability, care must
be taken in how the power management mode is entered. Furthermore, nap and sleep modes
require all outstanding bus operations to be completed before the power management mode
is entered. Normally, during system configuration, one of the power management
modes is selected by setting the appropriate HID0 mode bit. Later, the power
management mode is invoked by setting the MSR[POW] bit. To provide a clean transition
into and out of the power management mode, the mtmsr instruction that sets MSR[POW]
should be preceded by a sync instruction and followed by an isync instruction.
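The register values involved in this sequence can be computed as follows. This is a sketch under stated assumptions: it uses the PowerPC convention that bit 0 is the most-significant bit of a 32-bit register, it takes the HID0 bit numbers (8, 9, 10, 11) from Section 9.2.1, and it assumes MSR[POW] is MSR bit 13, which is not stated in this section. The helper names are hypothetical, and the instruction sequence is returned as data rather than executed.

```python
def be_bit(n: int, width: int = 32) -> int:
    """Mask for bit n when bit 0 is the most-significant bit."""
    return 1 << (width - 1 - n)


HID0_DOZE = be_bit(8)    # HID0[8]
HID0_NAP = be_bit(9)     # HID0[9]
HID0_SLEEP = be_bit(10)  # HID0[10]
HID0_DPM = be_bit(11)    # HID0[11], dynamic power management enable
MSR_POW = be_bit(13)     # assumption: MSR[POW] is MSR bit 13

_MODE_BITS = {"doze": HID0_DOZE, "nap": HID0_NAP, "sleep": HID0_SLEEP}
_ALL_MODES = HID0_DOZE | HID0_NAP | HID0_SLEEP


def select_power_mode(hid0: int, mode: str) -> int:
    """Return a HID0 value with exactly one power saving mode bit set."""
    return (hid0 & ~_ALL_MODES & 0xFFFFFFFF) | _MODE_BITS[mode]


def entry_sequence(msr: int) -> list:
    """Instruction order for invoking the selected mode: sync, mtmsr, isync."""
    return [("sync", None), ("mtmsr", msr | MSR_POW), ("isync", None)]
```

Clearing all three mode bits before setting the requested one enforces the "one and only one" rule from Section 9.2, while leaving unrelated bits such as the DPM enable untouched.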

Appendix A

PowerPC Instruction Set Listings 



This appendix lists the PowerPC 603e microprocessor’s instruction set as well as the 
additional PowerPC instructions not implemented in the 603e. Instructions are sorted by 
mnemonic, opcode, function, and form. Also included in this appendix is a quick reference 
table that contains general information, such as the architecture level, privilege level, and 
form, and indicates if the instruction is 64-bit and optional. 

Note that split fields, which represent the concatenation of sequences from left to right, are
shown in lowercase. For more information, refer to Chapter 8, “Instruction Set,” in The
Programming Environments Manual.

A.1 Instructions Sorted by Mnemonic 

Table A-1 lists the instructions implemented in the PowerPC architecture in alphabetical
order by mnemonic. 



Table A-1. Complete Instruction List Sorted by Mnemonic

Key:
    Reserved bits
    Instruction not implemented in the 603e

Name     0-5  6-10  11-15  16-20  21   22-30         31
-------  ---  ----  -----  -----  ---  ------------  --
addx     31   D     A      B      OE   266           Rc
addcx    31   D     A      B      OE   10            Rc
addex    31   D     A      B      OE   138           Rc
addi     14   D     A      SIMM (bits 16-31)
addic    12   D     A      SIMM (bits 16-31)
addic.   13   D     A      SIMM (bits 16-31)
addis    15   D     A      SIMM (bits 16-31)
addmex   31   D     A      00000  OE   234           Rc
addzex   31   D     A      00000  OE   202           Rc
andx     31   S     A      B      28 (bits 21-30)    Rc
andcx    31   S     A      B      60 (bits 21-30)    Rc
Name


0 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 


and!. 

andis. 

bx 

bcx 

bcctrx 

bcirx 

cmp 

cmpi 

cmpi 

' 

cmpli 


28 


S 


A 


UIMM 


29 


s 


A 


UIMM 


18 


LI 


AA 


LK 


16 


BO 


Bi 


BD 


AA 


LK 


19 


BO 


Bl 


00000 


528 


LK 


19 


BO 


BI 


00000 


16 


LK 


31 


crfD 


D 


L 


A 


B 


0 




ll 


11 


crfD 


0 


L 


A 


SIMM 


31 


crfD 


0 


L 


A 


B 


32 


II 


10 


CrfD 


0 


L 


A 




UIMM 










wm 

m 










cntizwx 

crand 

crandc 

creqv 

crnand 

crnor 

cror 

crorc 

crxor 

debt 

debP 

debst 

debt 

debts! 

debz 


31 


S 


A 


00000 


26 


Rc 


19 




crbA 


crbB 


257 


0 


19 


crbD 


crbA 


crbB 


129 


0 


19 


crbD 


crbA 


crbB 


289 


0 


19 


crbD 


crbA 


crbB 


225 


i 


19 


crbD 


crbA 


crbB 


33 


Q 


19 


crbD 


crbA 


crbB 


449 


0 


19 


crbD 


crbA 


crbB 


417 


0 


19 


crbD 


crbA 


crbB 


193 




31 


00000 


A 


B 


86 


i 


31 


00000 


A 


B 


470 


0 


31 


00000 


A 


B 


54 


0 


31 


00000 


A 


B 

. 


278 


0 


31 


00000 


A 


B 


246 


0 


31 


00000 


A 


B 


1014 


0 






5 ; ' ;S. 












Be 

1^ 


divwx 

divwux 

eeiwx 

eeowx 

eieio 


31 


D 


A 


B 


OE 


491 


m 


31 


D 


A 


B 


OE 


459 


Rc 


31 


D 


A 


B 


310 


M 


31 


S 


A 


B 


438 


mm 

H 


31 


00000 


00000 


00000 


854 


s 
Name 0



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



eqvx 


31 


S 


A 


B 


284 


Rc 


extsbx 


31 


S 


A 


00000 


954 


Rc 


extshx 


31 


S 




A 


00000 


922 




Rc 


extswx^ 


31 


s 




A 


00000 










fabsx 


63 


D 


00000 


B 


264 






faddx 

faddsx 


63 

59 


D 

D 


A 

A 


B 

B 


00000 

00000 


- 


21 

21 


IQ 

IQ 

1 


tempo 


63 


crfD 


00 


A 


B 


32 


i 


fempu 


63 


crfD 


00 


A 


B 


C 






i 






0 




- — r— 


■ ■ 




1 

i 


fetiwx 


63 


D 




00000 


B 


u ^ 




fctiwzx 


63 


D 


00000 


B 


15 




fdivx 


63 


D 


A 


B 


00000 


18 


Rc 


fdivsx 


59 


D 


A 


B 


00000 


18 


^Q 


fmaddx 


63 


D 


A 


B 


c 


29 


Rc 


fmaddsx 


59 


D 


1 

A 1 




c 


29 


^Q 


fmrx 


63 


D 


00000 1 


B 


72 


Rc 


fmsubx 


63 


D 


A i 


B j 


C 


28 




fmsubsx 


59 


D 


A 


B 


C 


28 


Rc 


fmulx 


63 


D 


i 

A 


00000 


C 


25 


Rc 


fmulsx 


59 


D 


A 


00000 


C 


25 


Rc 


fnabsx 


63 


D 


00000 


B 


136 


Rc 


fnegx 


63 


D 


00000 


B 


40 


Rc 


fnmaddx 


63 


D 


A 


B 


C 


31 


Rc 


fnmaddsx 


59 


D 


A 


B 


C 


31 




fnmsubx 


63 


D 


A 


B 


C 


30 


Rc 


fnmsubsx 


59 


D 


A 


B 


C 


30 


Rc 


fresx® 


59 


D 


00000 


B 


00000 


24 


Rc 


frspx 


63 


D 


00000 


B 


12 


Rc 


frsqrtex® 


63 


D 


00000 


B 


00000 


26 


Rc 


f seix ® 


63 


D 


A 


B 


C 


23 


Rc 
Name 0



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



Iswi ^ 
Iswx^ 


31 


D 


A 


NB 


597 


0 


31 


D 


A 




B 






533 


0 


58 


D 










Iwarx 

Iwbrx 
iwz 
Iwzu 
iwzux 
iwzx 
mcrf 
mcrfs 
mcrxr 
mfcr 
mffsx 
mfmsr ^ 
mfspr ^ 
mfsr ^ 
mfsrin ^ 
mftb 
mtcrf 
mtfsbOx 
mtfsbix 
mtfsfx 
mtfsfix 
mtmsr ^ 
mtspr ^ 
mtsr ^ 
mtsrin ^ 

muihwx 


31 

31 


D 

D 


A 

A 


B 

B 




20 


I 

i 

i 


534 


B 


32 


D 


A 


d 


33 


D 


A 


d 


31 


D 


A 


B 


55 


B 


31 


D 


A 


B 


23 


1 


19 


crfD 


III! 


crfS 


00 


00000 


0 


B 


63 


crfD 


Illl 


crfS 


00 


00000 


64 


B 


31 


crfD 


nil 


00000 


00000 


512 


1 


31 


D 


00000 


00000 


19 


B 


63 


D 


00000 


00000 


583 




31 


D 


00000 


00000 


83 


0 


31 


D 


spr 


339 


B 


31 


D 


0 


SR 


00000 


595 


0 


31 


D 


00000 


B 


659 


0 


31 


D 


tbr 


371 


0 


31 


S 


0 


CRM 


0 


144 


B 


63 


crbD 


00000 


00000 


70 




63 


crbD 


00000 


00000 


38 




63 


o 

u. 

IHI 


B 


711 


Rc 


63 


crfD 


00 


00000 


IMM 


0 


134 




31 


S 


00000 


00000 


146 


B 


31 


S 


spr 


467 


B 


31 


S 


r° 


SR 


00000 


210 


1 


31 

* ^ . 'i. 'i ' 


S 

' o' 




00000 


B 




242 


B 




' -A 


- 


— “ — 






s 


31 


D 


A 


B 


B 


75 


Rc 
Name 0



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



stbu 

stbux 

stbx 


39 


S 


A 


d 


31 

31 


S 

S 


A 

A 




247 

215 

' I ^ 


0 

0 


stfdu 
stfdux 
stfdx 
stf iwx ® 
stfs 
stfsu 
stfsux 
stfsx 
sth 
sthbrx 
sthu 
sthux 
sthx 
stmw ^ 
stswi ^ 

StSWX 

stw 

stwbrx 

stwcx. 

stwu 

stwux 

stwx 

subfx 

subfcx 




S 


A 






55 


S 


A 


d 


31 


S 


A 


B 


759 


0 


31 


S 


A 


B 


727 


iii 


31 


S 


A 


B 


983 


II 


52 


S 


A 


d 


53 


S 


A 


d 


31 


S 


A 


B 


695 


0 


31 


S 


A 


B 


663 


0 


44 


S 


A 


d 


31 


S 


A 


B 


918 


0 


45 


S 


A 

.. , I 


d 


31 


s 


A 


B 


439 


0 


31 


s 


A 


B 


407 




0 


47 


s 


A 


d 


31 


s 


A 


NB 


725 


0 


31 


s 


A 


B 


661 


0 


36 


s 


A 


d 


31 


s 


A 


B 


662 


0 


31 


s 


A 


B 


150 


1 


37 


s 


A I 


d 


31 


s 


A ! 


B 


183 


0 


31 


s 


A 


B 


151 


0 


31 


D 


A 


B 


OE 


40 


Rc 


31 


D 


A 


B 


s 


8 
Name            0-5  6-10   11-15  16-20  21   22-30         31
--------------  ---  -----  -----  -----  ---  ------------  --
subfex          31   D      A      B      OE   136           Rc
subfic          08   D      A      SIMM (bits 16-31)
subfmex         31   D      A      00000  OE   232           Rc
subfzex         31   D      A      00000  OE   200           Rc
sync            31   00000  00000  00000  598 (bits 21-30)   0
[rows between sync and tlbie are illegible in the source]
tlbie           31   00000  00000  B      306 (bits 21-30)   0
tlbld (1, 6)    31   00000  00000  B      978 (bits 21-30)   0
tlbli (1, 6)    31   00000  00000  B      1010 (bits 21-30)  0
tlbsync (1, 5)  31   00000  00000  00000  566 (bits 21-30)   0
tw              31   TO     A      B      4 (bits 21-30)     0
twi             03   TO     A      SIMM (bits 16-31)
xorx            31   S      A      B      316 (bits 21-30)   Rc
xori            26   S      A      UIMM (bits 16-31)
xoris           27   S      A      UIMM (bits 16-31)

1 Supervisor-level instruction
2 Supervisor- and user-level instruction
3 Load and store string or multiple instruction
4 64-bit instruction
5 Optional in the PowerPC architecture
6 603e-implementation specific instruction

A.2 Instructions Sorted by Opcode

Table A-2 lists the instructions defined in the PowerPC architecture in numeric order by 
opcode. 



Key: 

[ I Reserved bits 




Instruction not implemented in the 603e 



Table A-2. Complete Instruction List Sorted by Opcode 



Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 







TO 






twi 

muili 

subtle 

empli 

empi 

addic 

addic. 

addi 

addis 

bex 

sc 

bx 

merf 

bcirx 

ernor 

rfi 

crandc 

isync 

crxor 

ernand 

crand 

creqv 

crorc 

cror 

bcctrx 

rlwimix 


00001 1 


TO 


A 


SIMM 


0001 1 1 


D 


A 


SIMM 


001 000 


D 


A 


SIMM 


001010 


crfD 


n 


L 


A 


UIMM 


001011 


crfD 


il 


L 


A 


SIMM 


001100 


D 


A 


SIMM 


001101 


D 


A 


SIMM 


001110 


D 


A 


SIMM 


001111 


D 


A 


SIMM 


010000 


BO 


Bl 


BD 


AA 


LK 


01 0001 


00000 


00000 


000000000000000 


1 


o; 


010010 


LI 


AA 


LK 


010011 


crfD 


00 


erfS 0 0 


00000 


0000000000 


0 


010011 


BO 


Bl 


00000 


000001 0000 


LK 


010011 


crbD 


crbA 


crbB 


00001 00001 


0 


010011 


00000 


00000 


00000 


00001 10010 


0 


010011 


crbD 


crbA 


crbB 


001 0000001 


0 


010011 


00000 


00000 


00000 


0010010110 


0 


010011 


crbD 


crbA 


crbB 


001 1 000001 


0 


010011 


crbD 


crbA 


crbB 


00111 00001 


0 


010011 


crbD 


crbA 


crbB 


01 00000001 


0 


010011 


crbD 


crbA 


crbB 


01001 00001 


0 


010011 


crbD 


crbA 


crbB 


01101 00001 


0 


010011 


crbD 


crbA 


crbB 


0111 000001 


0 


010011 


BO 


Bl 


00000 


1 00001 0000 


LK 


010100 


S 


A 


SH 


MB ME 


Rc 
cmp


011111 




i 


i 


A 


B 


0000000000 


0 


tw 


011111 


TO 


A 


B 


00000001 00 


II 


subfcx 


011111 


D 


A 


B 


s 


0000001000 






011111 


U 




S 


i 


OOOOOO 1001 




addcx 


011111 


D 


A 


B 


s 


0000001010 


S3 


mulhwux 


01 1111 


D 


A 


B 


0 


0000001011 




mfcr 


IIIQQQQQQIIII 


D 






0000010011 


oi 


Iwarx 




D 


A 


B 


0000010100 


o" 







dcbst 

Iwzux 
Name 0



5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 









A 




0000111010 


g 


andcx 


011111 


s 


A 


B 


0000111100 




I imrthdx* 




■■ ' ' ' /d , ^ ' 


A ' 

" k'' ' ' 


" '/ ^ ' 


- 

0 


0001001001 


i 


mulhwx 


011111 


D 


A 


B 


0 


000 1 001 01 1 




mfmsr 


011111 


D 






0001 010011 


Q 




oilin': 




A 

^ ' 


;s 


' 


' 0001010100 ' 


i 


dcbf 


011111 


00000 


A 


B 


0001 010110 


1 


Ibzx 


011111 


D 


A 


B 


0001 010111 


o: 


negx 


011111 


D 


■nH 






0001 1 01 000 




Ibzux 


011111 


D 


A 


B 


0001 110111 




norx 


011111 


S 


A 


B 


0001 1 1 1 1 00 


IIQ 


subfex 


011111 


D 


A 


B 


i 


001 0001 000 




addex 


011111 


D 


A 


B 


s 






mtcrf 


011111 


S 


0 


CRM 


j| 


001001 0000 


0 


mtmsr 


011111 


S 








■ 


■ 


0010010010 


0 




011111 












0010010101 


ij 


stwcx. 


011111 


s 

■ ■ 


A 


B 




0010010110 


1 


stwx 


011111 


S 


A 


B 




0 




011111 






B 


g;i 




o1 


stwux 


011111 


s 




A 


B 


asms 


■nmgj 


00101 10111 


0] 


subfzex 


011111 


D 


A 


00000 


OE 






addzex 


011111 


D 


A 


00000 


OE 


0011001010 


B 


mtsr 


011111 


S 


0 


SR 


00000 


0011010010 


B 


stdcx.^ 


011111' 




: 

i 




. , 8 V 




V.; ' osnsi'b’iid' , - 


i 


stbx 


011111 


S 


A 




0011010111 




subfmex 




D 




A 


00000 




OE 


0011101000 












A 


' B ^ 




t!»E 


0011101001 


n4 


addmex 


011111 


■■■■■■ 

D 


A 






0011101010 


Rc 


mullwx 


011111 


D 


A 


B 


OE 


0011101011 




mtsrin 


011111 


S 




B 


0011110010 


B 


dcbtst 


011111 


00000 


A 


B 


0011110110 


Q 


stbux 


011111 


S 


A 


B 


0011110111 


HI 



Appendix A. PowerPC Instruction Set Listings 



A-11 





























Name 0 



5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



addx 
debt 
Ihzx 
eqvx 
tibie 
eciwx 
Ihzux 
xorx 
mfspr ^ 


011111 


D 


A 


B 


OE 


0100001010 


Rc 


0 111 1 1 


00000 


A 


B 


01 000101 1 0 


0 


011111 


D 


A 


B 


01 0001 0111 


II 


011111 


S 


A 


B 


010001 1100 


Rc 


011111 


00000 


00000 


B 


0100110010 


o: 


011111 


D 


A 


B 


0100110110 


0 


011111 


D 


A 


B 


0100110111 


0 


011111 


s 


A 


B 


0100111100 


Rc 


011111 


D 


spr 


0101010011 


0 




011111 


D 




8 




i 


lhax 


011111 


D 


A 


B 


0101010111 


i 




011111 




00000 

tt 


)r 


0101110011 


1 


mftb 


011111 


D 


lhaux 

sthx 

prex 


011111 




' 


B 


0101110101 


1 


011111 


D 


A 


B 


0101110111 


i 


011111 


S 


A 


B 


0110010111 


0 


011111 


S 


A 


B 


0110011100 






— 

■SBBi 




A ' 

000,00 ' 


^ ' 

Sn 

, ' 8 . 




M 

i 


ecowx 

sthux 

orx 


011111 




A 


B 


0110110110 




011111 


S 


||B|[| 


B 


0110110111 


i 


011111 


S 


A 


B 


0110111100 


Rc 






D 




0 


OS 






divwux 
mtspr 2 
debi 
nandx 


011111 


D 


A 


B 


OE 


0111001011 




011111 


S 


spr 


01 11010011 


B 


011111 


00000 


A 


B 


0111010110 


B 


011111 


S 


A 


B 


011101 1 100 






^ "a 'a a ' 

\ 1 ' 




rT 

A ^ 


r‘y^y-"p " ■■■ 


OE 


' ' A 4 1 i'l m PA i ' 




divwx 

merxr 
Iswx ^ 
Iwbrx 


011111 

011111 


D 


mniQiiiiiii 


B 


OE 


0111101011 


IS 


II 




. 00000 
00000 


oHi 110010 
1 000000000 


B 

B 


011111 


D 


A 


B 


1 00001 0101 


0 


011111 


D 


A 


B 


1 00001 0110 


0 
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011111 

011111 




011111 

011111 

011111 

011111 

011111 

011111 

011111 



stfsx 


011111 


stfsux 


011111 


jtswi ^ 


011111 


stfdx 


011111 


stfdux 


011111 


Ihbrx 


011111 


srawx 


011111 



D 

00000 

D 

D 

D 

S 

S 



S 

S 

S 

S 

S 

D 

S 



0 0 0 0 0 
A 
SR 
A 

00000 

A 

A 

00000 

A 

A 




srawix 


011111 


s 


eieio 


011111 


00000 


sthbrx 


011111 


s 


extshx 


011111 


s 


extsbx 


011111 


s 


llbld^® 


011111 


00000 


icbi 


011111 


00000 



A 

00000 

A 

A 

A 

00000 

A 

A 



00000 

B 

00000 

NB 

00000 

B 

B 

B 

B 

B 

B 

B 

NB 

B 

B 

B 

B 



SH 

00000 

B 

00000 

00000 

B 

B 

B 



1 00001 01 1 1 
1000011 000 



1000110110 
1000110111 
1001010011 
1001010101 
1001010110 
1001010111 
1001110111 
1010010011 
1010010101 
1010010110 
1010010111 
1010110111 
1011010101 
1011010111 
1011110111 
1 1 0001 0110 
1 1 0001 1 000 



1100011010 



1100111000 

1101010110 

1110010110 

1110011010 

1110111010 

1111010010 

1111010110 

1111010111 



iH 

■ 

na 

isq 

IB 

m 

IB 

IB 

IB 

B 

IB 

B 

B 

m 

B 

B 



B 

B 

lESI 




1111110010 
1111110110 
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Name 0 



5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



Iwzu 


1 00001 


D 


A 


d 


Ibz 


1 0001 0 




A 


d 


Ibzu 


1 0001 1 


D 


A 


d 


stw 


100100 


s 


A 


d 


stwu 


100101 


s 


A 


d 


stb 


100110 


s 


A 


d 


stbu 


100111 


s 


A 


d 


Ihz 


1 01 000 


D 


A 


d 


ihzu 


101001 


D 


A 


d 


lha 


101010 


D 


A 


d 


lhau 


101011 


D 


A 


d 


sth 


101100 


s 


A 


d 


sthu 


101101 


s 


A 


d 


Imw^ 


101110 


D 


A 


d 


stmw^ 


101111 


s 


A 


d 


Ifs 


1 10000 


D 


A 


d 


Ifsu 


1 1 0001 


D 


A 


d 


Ifd 


110010 


D 


A 


d 


Ifdu 


110011 


D 


A 


d 


stfs 


110100 


s 


A 


d 


stfsu 


110101 


s 


A 


d 


stfd 


110110 


s 


A 


d 


stfdu 


110111 


s 


A 




d 










111010 






== 




B 


1 


i 

'mk 


fdivsx 


111011 i 

' 


D 


A 


B 






■ 




fsubsx 


111011 


D 


A 


B 








faddsx 


111011 


D 


A 


B 


00000 




i 






111011 


D 












M 


fresx^ 


111011 


D i 


00000 


B 


00000 




■ 




fmulsx 


111011 


D 


A 


00000 


c 






fmsubsx 


111011 


D 


A 


B 


c 




B 
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Name 0 



5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



fmaddsx 

fnmsubsx 

fnmaddsx 


111011 


D 


A 


B 


C 


11101 


Rc 


111011 


D 


A 


B 


C 


11110 


Rc 


111011 


D 


A 


B 


C 


11111 


Rc 


fcmpu 

frspx 

fctiwx 

fctiwzx 

fdivx 

fsubx 

faddx 

tselx^ 

fmuix 

frsqrtex^ 

fmsubx 

fmaddx 

fnmsubx 

fnmaddx 

tempo 

mtfsbix 

fnegx 

merfs 

mtfsbOx 

fmrx 

mtfsfix 

fnabsx 

fabsx 

mffsx 

mtfsfx 


111110 
■' ,111110 ' " 






— — 


ds 




111111 


crfD 


00 


A 


B 


0000000000 


0 


111111 


D 


00000 


B 


0000001 100 


Rc 


111111 


D 


00000 


B 


0000001 1 1 0 




111111 


D 


00000 


B 


0000001 1 1 1 


Rc 


111111 


D 


A 


B 


00000 


10010 


Rc 


111111 


D 


A 


B 


00000 


10100 


Rc 


111111 

111111 


D 

D 


A 

A 


B 

B 


00000 

c 


10101 




111111 


D 


A 


00000 


c 






111111 


D 


00000 


B 


00000 


11010 


02 


111111 


D 


A 


B 


c 


11100 


02 


111111 


D 


A 


B 


c 


11101 


02 


111111 


D 


A 


B 


c 


11110 


s 


111111 


D 


A 


B 


c 


11111 


s 


111111 


crfD 


00 


A 


B 


00001 00000 


0 


111111 


crbD 


00000 1 


msm 


00001 00110 




111111 


D 


00000 1 


B 


00001 01000 


02 


111111 


CrfD 


00 


erfS 0 0 ! 


00000 


0001 000000 


0 


111111 


crbD 


00000 


IfflSSII 


0001 0001 1 0 




111111 


D 


00000 i 


B 


0001 001000 




111111 


CrfD 


00 


00000 i 


■HQ 


001 00001 1 0 


Rc 


111111 


D 




B 


001 0001 000 


02 


111111 


D 


00000 


B 


01 00001 000 


12 


1 1 1 1 1 1 


D 




■Bl 


1 001 0001 1 1 


02 


111111 


, T 


B 


1011 0001 1 1 


02 
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Name 0 



5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 




¹ Supervisor-level instruction
² Supervisor- and user-level instruction
³ Load and store string or multiple instruction
⁴ 64-bit instruction
⁵ Optional in the PowerPC architecture
⁶ 603e-implementation specific instruction

A.3 Instructions Grouped by Functional Categories

Table A-3 through Table A-30 list the PowerPC instructions grouped by function.

Key:

Reserved bits
Instruction not implemented in the 603e



Table A-3. Integer Arithmetic Instructions

Name      0-5   6-10   11-15   16-20   21   22-30   31
addx      31    D      A       B       OE   266     Rc
addcx     31    D      A       B       OE   10      Rc
addex     31    D      A       B       OE   138     Rc
addi      14    D      A       SIMM (bits 16-31)
addic     12    D      A       SIMM (bits 16-31)
addic.    13    D      A       SIMM (bits 16-31)
addis     15    D      A       SIMM (bits 16-31)
addmex    31    D      A       00000   OE   234     Rc
addzex    31    D      A       00000   OE   202     Rc
divwx     31    D      A       B       OE   491     Rc
divwux    31    D      A       B       OE   459     Rc
mulhwx    31    D      A       B       0    75      Rc
mulhwux   31    D      A       B       0    11      Rc
mulli     07    D      A       SIMM (bits 16-31)
mullwx    31    D      A       B       OE   235     Rc
negx      31    D      A       00000   OE   104     Rc
subfx     31    D      A       B       OE   40      Rc
subfcx    31    D      A       B       OE   8       Rc
subfic    08    D      A       SIMM (bits 16-31)
subfex    31    D      A       B       OE   136     Rc
subfmex   31    D      A       00000   OE   232     Rc
subfzex   31    D      A       00000   OE   200     Rc
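
The rows of Table A-3 can be read as field values that are packed left to right into a 32-bit word, with bit 0 as the most significant bit. The following is a minimal Python sketch of that packing for the XO-form rows (0-5, 6-10, 11-15, 16-20, 21, 22-30, 31). The function name `encode_xo_form` and the choice of registers are illustrative assumptions, not part of this manual.

```python
def encode_xo_form(opcd, d, a, b, oe, xo, rc):
    """Pack XO-form fields into a 32-bit word (bit 0 is the most significant bit)."""
    # (value, width in bits), in the left-to-right order used by Table A-3
    fields = [(opcd, 6), (d, 5), (a, 5), (b, 5), (oe, 1), (xo, 9), (rc, 1)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


# addx row (primary opcode 31, extended opcode 266) with D=3, A=4, B=5, OE=0, Rc=0
ADD_R3_R4_R5 = encode_xo_form(opcd=31, d=3, a=4, b=5, oe=0, xo=266, rc=0)
print(f"{ADD_R3_R4_R5:08X}")  # prints 7C642A14
```

Setting OE or Rc to 1 selects the overflow-enabled or record form of the same instruction; only bits 21 and 31 of the word change.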
Table A-4. Integer Compare Instructions

Name    0-5   6-8    9   10   11-15   16-20   21-30        31
cmp     31    crfD   0   L    A       B       0000000000   0
cmpi    11    crfD   0   L    A       SIMM (bits 16-31)
cmpl    31    crfD   0   L    A       B       32           0
cmpli   10    crfD   0   L    A       UIMM (bits 16-31)



Table A-5. Integer Logical Instructions

Name      0-5   6-10   11-15   16-20   21-30   31
andx      31    S      A       B       28      Rc
andcx     31    S      A       B       60      Rc
andi.     28    S      A       UIMM (bits 16-31)
andis.    29    S      A       UIMM (bits 16-31)
cntlzwx   31    S      A       00000   26      Rc
eqvx      31    S      A       B       284     Rc
extsbx    31    S      A       00000   954     Rc
extshx    31    S      A       00000   922     Rc
nandx     31    S      A       B       476     Rc
norx      31    S      A       B       124     Rc
orx       31    S      A       B       444     Rc
orcx      31    S      A       B       412     Rc
ori       24    S      A       UIMM (bits 16-31)
oris      25    S      A       UIMM (bits 16-31)
xorx      31    S      A       B       316     Rc
xori      26    S      A       UIMM (bits 16-31)
xoris     27    S      A       UIMM (bits 16-31)






Table A-6. Integer Rotate Instructions

Name      0-5   6-10   11-15   16-20   21-25   26-30   31
rlwimix   20    S      A       SH      MB      ME      Rc
rlwinmx   21    S      A       SH      MB      ME      Rc
rlwnmx    23    S      A       B       MB      ME      Rc
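
As an illustration of how the SH, MB, and ME fields of the rotate instructions fit together, the sketch below packs an M-form word and builds the 32-bit mask selected by MB and ME. It assumes the M-form field order S, A, SH, MB, ME, Rc with primary opcode 21 for rlwinm, and the PowerPC convention that the mask covers bits MB through ME (wrapping around when MB is greater than ME). That mask rule comes from the architecture definition, not from this appendix, and the function names are hypothetical.

```python
def encode_m_form(opcd, s, a, sh, mb, me, rc):
    """Pack M-form fields into a 32-bit word (bit 0 is the most significant bit)."""
    fields = [(opcd, 6), (s, 5), (a, 5), (sh, 5), (mb, 5), (me, 5), (rc, 1)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def rotate_mask(mb, me):
    """Return the mask with ones in bits mb..me (big-endian bit numbering)."""
    from_mb = 0xFFFFFFFF >> mb                     # ones in bits mb..31
    up_to_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF  # ones in bits 0..me
    # A wrapped mask (mb > me) is the union rather than the intersection.
    return (from_mb & up_to_me) if mb <= me else (from_mb | up_to_me)


# rlwinm with S=4, A=3, SH=2, MB=0, ME=29: a shift left by two bit positions
SLWI_R3_R4_2 = encode_m_form(opcd=21, s=4, a=3, sh=2, mb=0, me=29, rc=0)
```

With these values the mask keeps bits 0 through 29, which is why this rlwinm form behaves as a logical shift left by two.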



Table A-7. Integer Shift Instructions




Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Table A-9. Floating-Point Multiply-Add Instructions

Name       0-5   6-10   11-15   16-20   21-25   26-30   31
fmaddx     63    D      A       B       C       29      Rc
fmaddsx    59    D      A       B       C       29      Rc
fmsubx     63    D      A       B       C       28      Rc
fmsubsx    59    D      A       B       C       28      Rc
fnmaddx    63    D      A       B       C       31      Rc
fnmaddsx   59    D      A       B       C       31      Rc
fnmsubx    63    D      A       B       C       30      Rc
fnmsubsx   59    D      A       B       C       30      Rc



Table A-10. Floating-Point Rounding and Conversion Instructions

Name      0-5   6-10   11-15   16-20   21-30   31
fctiwx    63    D      00000   B       14      Rc
fctiwzx   63    D      00000   B       15      Rc
frspx     63    D      00000   B       12      Rc



Table A-11. Floating-Point Compare Instructions

Name    0-5   6-8    9-10   11-15   16-20   21-30   31
fcmpo   63    crfD   00     A       B       32      0
fcmpu   63    crfD   00     A       B       0       0



Table A-12. Floating-Point Status and Control Register Instructions

Name      0-5   6-10      11-15     16-20   21-30   31
mcrfs     63    crfD 00   crfS 00   00000   64      0
mffsx     63    D         00000     00000   583     Rc
mtfsb0x   63    crbD      00000     00000   70      Rc
mtfsb1x   63    crbD      00000     00000   38      Rc
mtfsfx    63    0 FM 0 (bits 6-15)  B       711     Rc
mtfsfix   63    crfD 00   00000     IMM 0   134     Rc
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Table A-13. Integer Load Instructions 



Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
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St ^ 


g 1 


A 






0 


sth 


44 


S 


A 




d 


sthu 


45 


S 


A 


d 


sthux 


31 


s 


A 


B 


439 




0 


sthx 


31 


s 


A 


B 


407 


II 


stw 


36 


s 


A 


d 


stwu 


37 


s 


A 


d 


stwux 


31 


s 


A 


B 


183 


11 


stwx 


31 


s 


A 1 

i 


B 


151 


II 


Table A-15. Integer Load and Store with Byte-Reverse Instructions

Name     0-5   6-10   11-15   16-20   21-30   31
lhbrx    31    D      A       B       790     0
lwbrx    31    D      A       B       534     0
sthbrx   31    S      A       B       918     0
stwbrx   31    S      A       B       662     0



Table A-16. Integer Load and Store Multiple Instructions

Name    0-5   6-10   11-15   16-31
lmw³    46    D      A       d
stmw³   47    S      A       d



Table A-17. Integer Load and Store String Instructions

Name     0-5   6-10   11-15   16-20   21-30   31
lswi³    31    D      A       NB      597     0
lswx³    31    D      A       B       533     0
stswi³   31    S      A       NB      725     0
stswx³   31    S      A       B       661     0



Table A-18. Memory Synchronization Instructions



Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Table A-21. Floating-Point Move Instructions

Name     0-5   6-10   11-15   16-20   21-30   31
fabsx    63    D      00000   B       264     Rc
fmrx     63    D      00000   B       72      Rc
fnabsx   63    D      00000   B       136     Rc
fnegx    63    D      00000   B       40      Rc



Table A-22. Branch Instructions

Name     0-5   6-10   11-15   16-20   21-30   31
bx       18    LI (bits 6-29), AA (bit 30)    LK
bcx      16    BO     BI      BD (bits 16-29), AA (bit 30)   LK
bcctrx   19    BO     BI      00000   528     LK
bclrx    19    BO     BI      00000   16      LK
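
The bx row of Table A-22 packs a word-aligned displacement into the LI field (bits 6-29), followed by the AA and LK bits. A minimal sketch of that packing is shown below. It assumes the usual PowerPC convention that LI holds the signed byte offset with its two low-order zero bits dropped; the function name `encode_branch` is illustrative only.

```python
def encode_branch(offset, aa=0, lk=0):
    """Encode an I-form branch (primary opcode 18) from a signed byte offset."""
    if offset % 4 != 0:
        raise ValueError("branch offset must be a multiple of 4")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("branch offset does not fit in the 24-bit LI field")
    if aa not in (0, 1) or lk not in (0, 1):
        raise ValueError("AA and LK are single-bit fields")
    # The offset already has two zero low-order bits, so masking it places
    # the 24 LI bits directly into bit positions 6-29 of the word.
    return (18 << 26) | (offset & 0x03FFFFFC) | (aa << 1) | lk


BL_FORWARD_8 = encode_branch(8, lk=1)   # branch forward 8 bytes and set the link register
B_BACKWARD_4 = encode_branch(-4)        # branch back one instruction
```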



Table A-23. Condition Register Logical Instructions

Name     0-5   6-10      11-15     16-20   21-30        31
crand    19    crbD      crbA      crbB    257          0
crandc   19    crbD      crbA      crbB    129          0
creqv    19    crbD      crbA      crbB    289          0
crnand   19    crbD      crbA      crbB    225          0
crnor    19    crbD      crbA      crbB    33           0
cror     19    crbD      crbA      crbB    449          0
crorc    19    crbD      crbA      crbB    417          0
crxor    19    crbD      crbA      crbB    193          0
mcrf     19    crfD 00   crfS 00   00000   0000000000   0



Table A-24. System Linkage Instructions

Name   0-5   6-10    11-15   16-20   21-30   31
rfi¹   19    00000   00000   00000   50      0
sc     17    00000   00000   00000000000000 (bits 16-29), 1 (bit 30)   0
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Table A-25. Trap Instructions 



Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Table A-29. Lookaside Buffer Management Instructions

Name         0-5   6-10    11-15   16-20   21-30   31
tlbie¹,⁵     31    00000   00000   B       306     0
tlbld¹,⁶     31    00000   00000   B       978     0
tlbli¹,⁶     31    00000   00000   B       1010    0
tlbsync¹,⁵   31    00000   00000   00000   566     0



Table A-30. External Control Instructions

Name    0-5   6-10   11-15   16-20   21-30   31
eciwx   31    D      A       B       310     0
ecowx   31    S      A       B       438     0



¹ Supervisor-level instruction
² Supervisor- and user-level instruction
³ Load and store string or multiple instruction
⁴ 64-bit instruction
⁵ Optional in the PowerPC architecture
⁶ 603e-implementation specific instruction
A.4 Instructions Sorted by Form

Table A-31 through Table A-45 list the PowerPC instructions grouped by form.

Key:

Reserved bits
Instruction not implemented in the 603e




Table A-31. I-Form




Specific Instruction 

Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 




Table A-32. B-Form 




Specific Instruction 

Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 




Table A-33. SC-Form 




Specific Instruction 

Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 




Table A-34. D-Form

OPCD   D          A   d
OPCD   D          A   SIMM
OPCD   S          A   d
OPCD   S          A   UIMM
OPCD   crfD 0 L   A   SIMM
OPCD   crfD 0 L   A   UIMM
OPCD   TO         A   SIMM
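
The D-form layouts above all share the same field boundaries: OPCD in bits 0-5, a 5-bit field in bits 6-10, A in bits 11-15, and a 16-bit immediate in bits 16-31. The following sketch pulls those fields out of a 32-bit word. The function name `decode_d_form` and the dictionary keys are illustrative assumptions; whether the immediate is read as signed (SIMM, d) or unsigned (UIMM) depends on the instruction, so both views are returned.

```python
def decode_d_form(word):
    """Split a 32-bit D-form instruction word into its fields."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction word must fit in 32 bits")
    immediate = word & 0xFFFF
    return {
        "opcd": (word >> 26) & 0x3F,   # bits 0-5
        "d": (word >> 21) & 0x1F,      # bits 6-10 (D, S, TO, or crfD 0 L)
        "a": (word >> 16) & 0x1F,      # bits 11-15
        "uimm": immediate,             # bits 16-31 read as unsigned
        # bits 16-31 read as a two's-complement signed value
        "simm": immediate - 0x10000 if immediate & 0x8000 else immediate,
    }


# 0x3864FFFF has primary opcode 14 (addi) with D=3, A=4, SIMM=-1
FIELDS = decode_d_form(0x3864FFFF)
```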
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A-28 



PowerPC 603e RISC Microprocessor User's Manual 







Table A-35. DS-Form

OPCD   D   A   ds   XO
OPCD   S   A   ds   XO



Specific Instructions 

Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 




Table A-36. X-Form

OPCD   D          A       B       XO   0
OPCD   D          A       NB      XO   0
OPCD   D          00000   B       XO   0
OPCD   D          00000   00000   XO   0
OPCD   D          0 SR    00000   XO   0
OPCD   S          A       B       XO   Rc
OPCD   S          A       B       XO   1
OPCD   S          A       B       XO   0
OPCD   S          A       NB      XO   0
OPCD   S          A       00000   XO   Rc
OPCD   S          00000   B       XO   0
OPCD   S          00000   00000   XO   0
OPCD   S          0 SR    00000   XO   0
OPCD   S          A       SH      XO   Rc
OPCD   crfD 0 L   A       B       XO   0
OPCD   crfD 00    A       B       XO   0
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fabsx 



63 



tempo 



fempu 

















































mfmsr ^ 


31 


D 


00000 


00000 


83 


II 


mfsr ^ 


31 


D 


Bi 


SR 




595 


I 


mfsrin ^ 


31 


D 


00000 


CD 


659 


I 


mtfsbOx 


63 


crbD 


00000 


00000 


70 


I 1 I 2 


mtfsbix 


63 


crfD 


00000 


00000 


38 


Rc 


mtfsfix 


63 


crbD 


00 


00000 


IMM 


i 


134 


Rc 


mtmsr ^ 


31 


S 


00000 


00000 


146 


Q 


mtsr ^ 


31 


S 


0 


SR 


00000 


210 


i 


mtsrin ^ 


31 


S 


00000 


B 


242 


3 


nandx 


31 


S 


A 


B 


476 


B 


norx 


31 


S 


A 


B 


124 


llQ 


orx 


31 


S 


A 


B 


444 




orcx 


31 


S 


HDB 


B 


412 




sWx^ 


'31 

ILI 


DOOOO 




'00/6o:~' 


'00000 




498 


w 


31 

$1 




y" ' 




■ 


434 


i 


siwx 


31 


S 


A 




24 


Rc 




31 


s 


Bill 


' ' "" A ' 






794 


ZP' 

Rc 


srawx 


31 


s 






A 


B 


792 


llQ 


srawix 


31 


s 


A 

‘ 


SH 


824 








s 












i 


srwx 


31 


s 


A 


B 


536 




stbux 


31 


S 


A 


B 


247 


0 


stbx 


31 


s 


A 


B 


215 


B 


tKdux^ 

stfdux 


31 

31 

! 


s 


A 


B 


«. n 

759 


B 

B 

B 

M 


stfdx 


31 


s 


A 


B 


727 


0 


stfiwx® 


31 




A 


B 


983 


0 


stfsux 


31 


s 


A 


B 


695 


0 


stfsx 


31 


s 


A 


B 


663 


0 


sthbrx 


31 


s 


A 


B 


918 


0 


sthux 


31 


s 


A 


B 


439 


0 


sthx 


31 


s 


A 


B 


407 


0 
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B 
B 
B 

00000 
B 
B 

Table A-37. XL-Form

OPCD   BO        BI        00000   XO   LK
OPCD   crbD      crbA      crbB    XO   0
OPCD   crfD 00   crfS 00   00000   XO   0
OPCD   00000     00000     00000   XO   0




Specific Instructions 

5 6/7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
BO Bi 00000 528 LK 

BO BI 00000 16 LH 



00000 
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19 


00000 


00000 


00000 


50 


■ 



Table A-38. XFX-Form

OPCD   D   spr       XO   0
OPCD   D   0 CRM 0   XO   0
OPCD   S   spr       XO   0
OPCD   D   tbr       XO   0

Specific Instructions

Name     0-5   6-10   11-20     21-30   31
mfspr²   31    D      spr       339     0
mftb     31    D      tbr       371     0
mtcrf    31    S      0 CRM 0   144     0
mtspr²   31    S      spr       467     0
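
One detail that the XFX-form rows do not show is how an SPR number maps onto the 10-bit spr field. In the PowerPC architecture the two 5-bit halves of the SPR number are swapped in the instruction word, with the low-order half placed first. The sketch below applies that rule for mfspr and mtspr. The swap is taken from the architecture definition and is not stated in this appendix, and the function names are hypothetical.

```python
def spr_field(spr_number):
    """Return the 10-bit spr field: low 5 bits of the SPR number first, then the high 5 bits."""
    if not 0 <= spr_number < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    low_half = spr_number & 0x1F
    high_half = (spr_number >> 5) & 0x1F
    return (low_half << 5) | high_half


def encode_mfspr(d, spr_number):
    """mfspr: primary opcode 31, extended opcode 339, bit 31 = 0."""
    return (31 << 26) | (d << 21) | (spr_field(spr_number) << 11) | (339 << 1)


def encode_mtspr(spr_number, s):
    """mtspr: primary opcode 31, extended opcode 467, bit 31 = 0."""
    return (31 << 26) | (s << 21) | (spr_field(spr_number) << 11) | (467 << 1)


MFLR_R3 = encode_mfspr(3, 8)       # SPR 8 is the link register
MFSPRG0_R3 = encode_mfspr(3, 272)  # SPR 272 is SPRG0
```

For SPR numbers below 32 the high half is zero, so the swap only shows up as a shift of the number by five bit positions within the field.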



Table A-39. XFL-Form

OPCD   0   FM   0   B   XO   Rc

Specific Instructions

Name     0-5   6   7-14   15   16-20   21-30   31
mtfsfx   63    0   FM     0    B       711     Rc

Table A-40. XS-Form

OPCD   S   A   sh   XO   sh   Rc



Specific instructions 

Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 




Table A-41. XO-Form

OPCD   D   A   B       OE   XO   Rc
OPCD   D   A   B       0    XO   Rc
OPCD   D   A   00000   OE   XO   Rc

Specific Instructions

Name      0-5   6-10   11-15   16-20   21   22-30   31
addx      31    D      A       B       OE   266     Rc
addcx     31    D      A       B       OE   10      Rc
addex     31    D      A       B       OE   138     Rc
addmex    31    D      A       00000   OE   234     Rc
addzex    31    D      A       00000   OE   202     Rc
divwx     31    D      A       B       OE   491     Rc
divwux    31    D      A       B       OE   459     Rc
mulhwx    31    D      A       B       0    75      Rc
mulhwux   31    D      A       B       0    11      Rc
mullwx    31    D      A       B       OE   235     Rc
negx      31    D      A       00000   OE   104     Rc
subfx     31    D      A       B       OE   40      Rc
subfcx    31    D      A       B       OE   8       Rc
subfex    31    D      A       B       OE   136     Rc
subfmex   31    D      A       00000   OE   232     Rc
subfzex   31    D      A       00000   OE   200     Rc








Table A-42. A-Form

OPCD   D   A       B       00000   XO   Rc
OPCD   D   A       B       C       XO   Rc
OPCD   D   A       00000   C       XO   Rc
OPCD   D   00000   B       00000   XO   Rc

Specific Instructions

Name        0-5   6-10   11-15   16-20   21-25   26-30   31
faddx       63    D      A       B       00000   21      Rc
faddsx      59    D      A       B       00000   21      Rc
fdivx       63    D      A       B       00000   18      Rc
fdivsx      59    D      A       B       00000   18      Rc
fmaddx      63    D      A       B       C       29      Rc
fmaddsx     59    D      A       B       C       29      Rc
fmsubx      63    D      A       B       C       28      Rc
fmsubsx     59    D      A       B       C       28      Rc
fmulx       63    D      A       00000   C       25      Rc
fmulsx      59    D      A       00000   C       25      Rc
fnmaddx     63    D      A       B       C       31      Rc
fnmaddsx    59    D      A       B       C       31      Rc
fnmsubx     63    D      A       B       C       30      Rc
fnmsubsx    59    D      A       B       C       30      Rc
fresx⁵      59    D      00000   B       00000   24      Rc
frsqrtex⁵   63    D      00000   B       00000   26      Rc
fselx⁵      63    D      A       B       C       23      Rc
fsubx       63    D      A       B       00000   20      Rc
fsubsx      59    D      A       B       00000   20      Rc

Table A-43. M-Form

Specific Instructions

Name      0-5   6-10   11-15   16-20   21-25   26-30   31
rlwimix   20    S      A       SH      MB      ME      Rc
rlwinmx   21    S      A       SH      MB      ME      Rc
rlwnmx    23    S      A       B       MB      ME      Rc




Table A-44. MD-Form

OPCD   S   A   sh   mb   XO   sh   Rc
OPCD   S   A   sh   me   XO   sh   Rc

Specific Instructions

0-5   6-10   11-15   16-20   21-25   26-31
Table A-45. MDS-Form

OPCD   S   A   B   mb   XO   Rc
OPCD   S   A   B   me   XO   Rc

Specific Instructions

Name 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 




¹ Supervisor-level instruction
² Supervisor- and user-level instruction
³ Load and store string or multiple instruction
⁴ 64-bit instruction
⁵ Optional in the PowerPC architecture
⁶ 603e-implementation specific instruction
A.5 Instruction Set Legend

Table A-46 provides general information on the PowerPC instruction set (such as the
architectural level, privilege level, and form).

Key:

Reserved bits
Instruction not implemented in the 603e

Table A-46. PowerPC Instruction Set Legend 



UISA VEA OEA Supervisor Level 64-Bit Optional Form 
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dcbi 



V 












XL 


V 












XL 


V 












XL 


V 












XL 


V 












XL 




V 










X 






yi 


>/ 






X 




yl 










X 




V 










X 




V 










X 




V 








— 


X 








1 




y} 












xo 


V 
















V 








yj 


X 












yj 


X 




V 










X 


V 












X 


: ^/ 












X 


:| 










= 


X 








i 






V 








1 


X 


V 












A 


V 












A 














V 










xj 


V 


— 


— 


_ — _ 




ii 


Pi 


• V 


— 


— 


- — 


_jL_ 


V 












X 


V 












A 


V 












A 


V 












A 
UISA   VEA   OEA   Supervisor Level   64-Bit   Optional   Form



mftb 




V 










XFX 


mtcrf 


V 












XFX 


mtfsbOx 


V 












X 


mtfsbix 


V 












X 


mtfsfx 


V 












XFL 


mtfsfix 


V 












X 


mtmsr ^ 






V 


V 






X 


mtspr 2 


V 




V 


V 






XFX 


mtsr^ 






V 


V 






X 


mtsrin ^ 






V 


V 






X 








1 












mulhwx 


V 












xo 


mulhwux 


V 














xo 








■ 












mulli 


V 














D 


mullwx 


V 












XO 


nandx 


V 













X 


negx 


V 




. . 

. 








XO 


norx 


V 












X 


orx 


V 












X 


orcx 


V 












X 


ori 


V 












D 


oris 


V 












D 


rfi ^ 






V 


V 






XL 


1 








s 








H 


rlwimix 


V 












M 


rlwinmx 


V 












M 


rlwnmx 


V 












M 








HI 


HHHHI 


HHHHHHHHi 


HHHHHH 


HHHHHi 


HHHHHN 
UISA   VEA   OEA   Supervisor Level   64-Bit   Optional   Form


sc 


V 




V 








SC 


«tbl9 


r 

' ' ' > 
;:J:' 


' 




liiii 




j ' , 

' ' 






siwx 


V 


















• •- 








; ’ • _' ‘ 




srawx 


V 












X 


srawix 


V 












X 


















srwx 


V 












X 


stb 


V 












D 


stbu 


< 












D 


stbux 


V 












X 


stbx 


V 












X 


1'. ' "std ^ 

\- «ldu^i 
1' stdux^i 




'' 

' 

' 


9 




— _ 
r,,.,. 1 

^ 

. " i 





"''^'4 "i 


stfd 


V 












D 


stfdu 


V 












D 


stfdux 














X 


stfdx 














X 


stfiwx ® 












V 


X 


stfs 














D 


stfsu 


V 












D 


stfsux 


V 












X 


stfsx 


V 












X 


sth 


V 












D 


sthbrx 


V 












X 


sthu 


V 












D 


sthux 


V 












X 


sthx 


< 

1 












X 
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stmw^ 


V 












D 


stswl ^ 


V 












X 


stswx^ 


V 












X 


stw 


V 












D 


stwbrx 


V 












X 


stwcx. 


V 




. 








X 


stwu 


a/ 












D 


stwux 


V 












X 


stwx 


V 












X 


subfx 


V 












xo 


subfcx 


V 






. 






xo 


subfex 


V 












xo 


subfic 


V 












D 


subfmex 


V 












XO 


subfzex 


V 












XO 


sync 


V 












X 


1 


J 

^ 1 




1 

< - 


bM 


iiiiiiiiiiii 






tlble^® 






V 


V 




V 


X 


tibid’® 

■ 








V 






X 


tlbll^® 








V 






X 


tibsync 






V 


V 






X 


tw 


V 












X 


twi 


V 












D 


xorx 


V 












X 


xori 


V 




1 

1 








D 


xoris 


V 




1 








D 



¹ Supervisor-level instruction
² Supervisor- and user-level instruction
³ Load and store string or multiple instruction
⁴ 64-bit instruction
⁵ Optional in the PowerPC architecture
⁶ 603e-implementation specific instruction
Appendix B

Instructions Not Implemented 

This appendix provides a list of the 32-bit and 64-bit PowerPC instructions that are not 
implemented in the PowerPC 603e microprocessor. It also provides the 64-bit SPR 
encoding that is not implemented by the 603e. Note that any attempt to execute instructions 
that are not implemented on the 603e will generate an illegal instruction exception. Note 
that exceptions are referred to as interrupts in the architecture specification. 

Table B-1 provides the 32-bit PowerPC instructions that are optional to the PowerPC 
architecture but not implemented by the 603e. 

Table B-1. 32-Bit Instructions Not Implemented by the PowerPC 603e Microprocessor

Mnemonic   Instruction
fsqrt      Floating Square Root (Double-Precision)
fsqrts     Floating Square Root Single
tlbia      TLB Invalidate All



Table B-2 provides a list of 64-bit instructions that are not implemented by the 603e. 

Table B-2. 64-Bit Instructions Not Implemented by the PowerPC 603e Microprocessor

Mnemonic   Instruction
cntlzd     Count Leading Zeros Double Word
divd       Divide Double Word
divdu      Divide Double Word Unsigned
extsw      Extend Sign Word
fcfid      Floating Convert From Integer Double Word
fctid      Floating Convert to Integer Double Word
fctidz     Floating Convert to Integer Double Word with Round toward Zero
ld         Load Double Word
ldarx      Load Double Word and Reserve Indexed
ldu        Load Double Word with Update
ldux       Load Double Word with Update Indexed
ldx        Load Double Word Indexed
lwa        Load Word Algebraic
lwaux      Load Word Algebraic with Update Indexed
lwax       Load Word Algebraic Indexed
mulld      Multiply Low Double Word
mulhd      Multiply High Double Word
mulhdu     Multiply High Double Word Unsigned
rldcl      Rotate Left Double Word then Clear Left
rldcr      Rotate Left Double Word then Clear Right
rldic      Rotate Left Double Word Immediate then Clear
rldicl     Rotate Left Double Word Immediate then Clear Left
rldicr     Rotate Left Double Word Immediate then Clear Right
rldimi     Rotate Left Double Word Immediate then Mask Insert
slbia      SLB Invalidate All
slbie      SLB Invalidate Entry
sld        Shift Left Double Word
srad       Shift Right Algebraic Double Word
sradi      Shift Right Algebraic Double Word Immediate
srd        Shift Right Double Word
std        Store Double Word
stdcx.     Store Double Word Conditional Indexed
stdu       Store Double Word with Update
stdux      Store Double Word Indexed with Update
stdx       Store Double Word Indexed
td         Trap Double Word
tdi        Trap Double Word Immediate
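
As a rough illustration of how a tool such as an assembler front end or simulator might use Tables B-1 and B-2, the sketch below checks a mnemonic against the two lists and reports whether executing it on the 603e would be expected to raise an illegal instruction exception. The set and function names are hypothetical, and matching on mnemonic text alone is a simplification made for this example.

```python
# Mnemonics from Table B-1 (32-bit instructions not implemented by the 603e)
NOT_IMPLEMENTED_32_BIT = {"fsqrt", "fsqrts", "tlbia"}

# Mnemonics from Table B-2 (64-bit instructions not implemented by the 603e)
NOT_IMPLEMENTED_64_BIT = {
    "cntlzd", "divd", "divdu", "extsw", "fcfid", "fctid", "fctidz",
    "ld", "ldarx", "ldu", "ldux", "ldx", "lwa", "lwaux", "lwax",
    "mulld", "mulhd", "mulhdu", "rldcl", "rldcr", "rldic", "rldicl",
    "rldicr", "rldimi", "slbia", "slbie", "sld", "srad", "sradi", "srd",
    "std", "stdcx.", "stdu", "stdux", "stdx", "td", "tdi",
}


def causes_illegal_instruction(mnemonic):
    """Return True if the mnemonic appears in Table B-1 or Table B-2."""
    name = mnemonic.strip().lower()
    return name in NOT_IMPLEMENTED_32_BIT or name in NOT_IMPLEMENTED_64_BIT
```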
Table B-3 provides the 64-bit SPR encoding that is not implemented by the 603e.



Table B-3. 64-Bit SPR Encoding Not Implemented by the PowerPC 603e 

Microprocessor 
Appendix C

PowerPC 603 Processor System Design 
and Programming Considerations 

While the PowerPC 603 microprocessor shares most of the attributes of the PowerPC 603e 
microprocessor, the system designer or programmer should keep in mind the 603 hardware 
and software differences, described in the following sections, that can require modifications 
to accommodate the 603 in systems designed for the 603e. 

C.1 PowerPC 603 Microprocessor Hardware 
Considerations 

The 603's hardware implementation differs from the 603e in the following ways: 

• XATS signal replaces CSE1 signal 

• Hardware support for access to direct-store segments 

• Bus clock multipliers of 1:1, 2:1, 3:1, and 4:1 only 

• 8-Kbyte, two-way set associative instruction and data caches 

• HID1 register not implemented in 603 

The following sections provide further information on the operation of some of the 
hardware features specific to the 603. 

C.1.1 Hardware Support for Direct-Store Accesses 

The 603 provides hardware support for direct-store bus accesses through the provision of 
the extended address transfer start (XATS) signal, and support for direct-store accesses in 
the bus interface unit. Direct-store accesses are invoked when a segment register T bit is set 
to 1. 

The operation of the XATS signal is described in the following section. The XATS signal 
is in the same location as the CSE1 signal on the 603e. 

C.1.1.1 Extended Address Transfer Start (XATS) 

The XATS signal is both an input and an output signal on the 603. 



C.1.1.1.1 Extended Address Transfer Start (XATS) - Output 

Following are the state meaning and timing comments for the XATS output signal. 

State Meaning      Asserted: Indicates that the 603 has begun a direct-store operation 
                   and that the first address cycle is valid. When asserted with the 
                   appropriate XATC signals it is also an implied data bus request for 
                   certain direct-store operations (unless it is an address-only operation). 

                   Negated: Is negated during an entire memory transaction. 

Timing Comments    Assertion: Coincides with the assertion of ABB. 

                   Negation: Occurs one bus clock cycle after the assertion of XATS. 

                   High Impedance: Coincides with the negation of ABB. 

C.1.1.1.2 Extended Address Transfer Start (XATS) - Input 

Following are the state meaning and timing comments for the XATS input signal. 

State Meaning      Asserted: Indicates that the 603 must check for a direct-store 
                   operation reply. 

                   Negated: Indicates that there is no need to check for a direct-store 
                   operation reply. 

Timing Comments    Assertion: May occur while ABB is asserted. 

                   Negation: Must occur one bus clock cycle after XATS is asserted. 

C.1.2 Direct-Store Protocol Operation 

The 603 defines separate memory-mapped and I/O address spaces, or segments, 
distinguished by the corresponding segment register T bit in the address translation logic 
of the 603. If the T bit is cleared, the memory reference is a normal memory-mapped access 
and can use the virtual memory management hardware of the 603. If the T bit is set, the 
memory reference is a direct-store access. 

The following points should be considered for direct-store accesses: 

• The use of direct-store segment accesses may have a significant impact on the 
performance of the 603. The provision of direct-store segment access capability by 
the 603 is to provide compatibility with earlier hardware I/O controllers and may not 
be provided in future derivatives of the 603 family. 

• Direct-store accesses are strongly ordered; for example, these accesses occur on the 
bus strictly in order with respect to the instruction stream. 

• Direct-store accesses provide synchronous error reporting. 

The 603 has a single bus interface to support both memory accesses and direct-store 
segment accesses. 

The direct-store protocol for the 603 allows for the transfer of 1 to 128 bytes of data 
between the 603 and the bus unit controller (BUC) for each single load or store request 
issued by the program. The block of data is transferred by the 603 as multiple single-beat 
bus transactions (individual address and data tenure for each transaction) until completion. 
The program waits for the sequence of bus transactions to be completed so that a final 
completion status (error or no error) can be reported precisely with respect to the program 
flow. The completion status is snooped by the 603 from a bus transaction run by the BUC. 

The system recognizes the assertion of the TS signal as the start of a memory-mapped 
access. The assertion of XATS indicates a direct-store access. This allows memory-mapped 
devices to ignore direct-store transactions. If XATS is asserted, the access is to a direct- 
store space and the following extensions to the memory access protocol apply: 

• A new set of bus operations is defined. The transfer type, transfer burst, and 
transfer size signals are redefined for direct-store operations; they convey the 
opcode for the I/O transaction (see Table C-1). 

• There are two beats of address for each direct-store transfer. The first beat (packet 0) 
provides basic address information such as the segment register and the sender tag 
and several control bits; the second beat (packet 1) provides additional addressing 
bits from the segment register and the logical address. 

• The TT0-TT3, TBST, and TSIZ0-TSIZ2 signals are remapped to form an 8-bit 
extended transfer code (XATC) which specifies a command and transfer size for the 
transaction. The XATC field is driven and snooped by the 603 during direct-store 
transactions. 

• Only the data signals DH0-DH31 and DP0-DP3 are used. The lower half of 
the data bus and parity is ignored. 

• The sender that initiated the transaction must wait for a reply from the receiver bus 
unit controller (BUC) before starting a new operation. 

• The 603 does not burst direct-store transactions. All direct-store transactions 
generated by the 603 are single-beat transactions of 4 bytes or less (single data beat 
tenure per address tenure). 

Direct-store transactions use separate arbitration for the split address and data buses and 
define address-only and single-beat transactions. The address-retry vehicle is identical, 
although there is no hardware coherency support for direct-store transactions. The ARTRY 
signal is useful, however, for pacing 603 transactions, effectively indicating to the 603 that 
the BUC is in a queue-full condition and cannot accept new data. 

In addition to the extensions noted above, there are fundamental differences between 
memory-mapped and direct-store operations. For example, only half of the 64-bit data path 
is available for 603 direct-store transactions. This lowers the pin count for I/O interfaces 
but generally results in substantially less bandwidth than memory-mapped accesses. 
Additionally, load/store instructions that address direct-store segments cannot complete 
successfully without an error-free reply from the addressed BUC. Because normal 
direct-store accesses involve multiple I/O transactions (streaming), they are likely to be very 
long latency instructions; therefore, direct-store operations usually stall 603 instruction issue. 

Figure C-1 shows a direct-store tenure. Note that the I/O device response is an address-only 
bus transaction. 

It should be noted that in the best case, the use of the 603 direct-store protocol degrades 
performance and requires the addressed controllers to implement 603 bus master capability 
to generate the reply transactions. 



[Diagram: an address tenure followed by the I/O response.] 




Figure C-1. Direct-Store Tenures 



C.1.2.1 Direct-Store Transactions 

The 603 defines seven direct-store transaction operations, as shown in Table C-1. These 
operations permit communication between the 603 and BUCs. A single 603 store or load 
instruction (that translates to a direct-store access) generates one or more direct-store 
operations (two or more direct-store operations for loads) from the 603 and one reply 
operation from the addressed BUC. 



Table C-1. Direct-Store Bus Operations 

Operation               Address Only   Direction    XATC Encoding
Load start (request)    Yes            603 => IO    0100 0000
Load immediate          No             603 => IO    0101 0000
Load last               No             603 => IO    0111 0000
Store immediate         No             603 => IO    0001 0000
Store last              No             603 => IO    0011 0000
Load reply              Yes            IO => 603    1100 0000
Store reply             Yes            IO => 603    1000 0000



For the first beat of the address bus, the extended address transfer code (XATC) contains 
the I/O opcode as shown in Table C-1; the opcode is formed by concatenating the transfer 
type, transfer burst, and transfer size signals defined as follows: 

XATC = TT[0:3] || TBST || TSIZ[0:2] 
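
The concatenation above can be sketched in a few lines of Python. This is only an illustration: the function and dictionary names are hypothetical, the fields are assumed to be passed as logical bit values in the order of the concatenation (not as electrical signal levels), and the opcode values are copied from Table C-1.

```python
# Hypothetical helpers; the opcode encodings are taken from Table C-1.
XATC_OPCODES = {
    0b0100_0000: "load start (request)",
    0b0101_0000: "load immediate",
    0b0111_0000: "load last",
    0b0001_0000: "store immediate",
    0b0011_0000: "store last",
    0b1100_0000: "load reply",
    0b1000_0000: "store reply",
}


def make_xatc(tt: int, tbst: int, tsiz: int) -> int:
    """Concatenate TT[0:3] || TBST || TSIZ[0:2] into one 8-bit value."""
    if not (0 <= tt <= 0xF and tbst in (0, 1) and 0 <= tsiz <= 0x7):
        raise ValueError("field out of range")
    return (tt << 4) | (tbst << 3) | tsiz


def decode_opcode(xatc: int) -> str:
    """Return the Table C-1 operation name, or 'undefined' if not listed."""
    return XATC_OPCODES.get(xatc, "undefined")


def is_address_only(xatc: int) -> bool:
    """In Table C-1, every address-only operation has XATC bit 3 (TT3) clear."""
    return (xatc >> 4) & 1 == 0
```

For example, `make_xatc(0b0100, 0, 0)` yields the load start (request) encoding, and `is_address_only` agrees with the Address Only column of Table C-1 for all seven operations.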

C.1.2.1.1 Store Operations 

There are three operations defined for direct-store store operations from the 603 to the 
BUC, defined as follows: 

1. Store immediate operations transfer up to 32 bits of data each from the 603 to the 
BUC. 

2. Store last operations transfer up to 32 bits of data each from the 603 to the BUC. 

3. Store reply from the BUC reveals the success/failure of that direct-store access to 
the 603. 

A direct-store store access consists of one or more data transfer operations followed by the 
I/O store reply operation from the BUC. If the data can be transferred in one 32-bit data 
transaction, it is marked as a store last operation followed by the store reply operation; no 
store immediate operation is involved in the transfer, as shown in the following sequence: 

STORE LAST (from 603) 



STORE REPLY (from BUC) 

However, if more data is involved in the direct-store access, there will be one or more store 
immediate operations. The BUC can detect when the last data is being transferred by 
looking for the store last opcode, as shown in the following sequence: 

STORE IMMEDIATE(s) 



STORE LAST 



STORE REPLY 

C.1.2.1.2 Load Operations 

Direct-store load accesses are similar to store operations, except that the 603 latches data 
from the addressed BUC rather than supplying the data to the BUC. As with memory 
accesses, the 603 is the master on both load and store operations; the external system must 
provide the data bus grant to the 603 when the BUC is ready to supply the data to the 603. 

The load request direct-store operation has no analogous store operation; it informs the 
addressed BUC of the total number of bytes of data that the BUC must provide to the 603 
on the subsequent load immediate/load last operations. For direct-store load accesses, the 
simplest, 32-bit (or fewer) data transfer sequence is as follows: 

LOAD REQUEST 



LOAD LAST 



LOAD REPLY (from BUC) 

However, if more data is involved in the direct-store access, there will be one or more load 
immediate operations. The BUC can detect when the last data is being transferred by 
looking for the load last opcode, as seen in the following sequence: 

LOAD REQUEST 



LOAD IMM(s) 



LOAD LAST 



LOAD REPLY 
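
The store and load sequences described above follow one pattern, which the following Python sketch makes explicit. The function name and the `transfers` parameter (the number of data-carrying operations) are hypothetical; how an access is split into individual transfers is not modeled here.

```python
def direct_store_sequence(kind: str, transfers: int) -> list[str]:
    """List the bus operations for one direct-store access.

    kind      -- "load" or "store"
    transfers -- number of data transfers (immediate operations plus the last one)
    """
    if kind not in ("load", "store"):
        raise ValueError("kind must be 'load' or 'store'")
    if transfers < 1:
        raise ValueError("at least one data transfer is required")
    ops = []
    if kind == "load":
        # Only loads begin with an address-only request operation.
        ops.append("load request")
    ops += [f"{kind} immediate"] * (transfers - 1)
    ops.append(f"{kind} last")
    # The BUC always closes the access with an address-only reply.
    ops.append(f"{kind} reply")
    return ops
```

With `transfers=1` the result contains no immediate operation, matching the simplest sequences shown for both stores and loads.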

Note that three of the seven defined operations are address-only transactions and do not use 
the data bus. However, unlike the memory transfer protocol, these transactions are not 
broadcast from one master to all snooping devices. The direct-store address-only 
transaction protocol strictly controls communication between the 603 and the BUC. 

C.1.2.2 Direct-Store Transaction Protocol Details 

As mentioned previously, there are two address-bus beats corresponding to two packets of 
information about the address. The two packets contain the sender and receiver tags, the 
address and extended address bits, and extra control and status bits. The two beats of the 
address bus (plus attributes) are shown at the top of Figure C-2 as two packets. The first 
packet, packet 0, is then expanded to depict the XATC and address bus information in 
detail. 

C.1.2.2.1 Packet 0 

Figure C-2 shows the organization of the first packet in a direct-store transaction. 

The XATC contains the I/O opcode, as discussed earlier and as shown in Table C-1. The 
address bus contains the following: 

Key bit || segment register || sender tag 



Figure C-2. Direct-Store Operation— Packet 0 

This information is organized as follows: 

• Bits 0 and 1 of the address bus are reserved; the 603 always drives these bits to 
zero. 

• Key bit: Bit 2 is the key bit from the segment register (either SR[Kp] or SR[Ks]). 
Kp indicates user-level access and Ks indicates supervisor-level access. The 603 
multiplexes the correct key bit into this position according to the current operating 
context (user or supervisor). (Note that user- and supervisor-level refer to problem 
and privileged state, respectively, in the architecture specification.) 

• Segment register: Address bits 3-27 correspond to bits 3-27 of the selected 
segment register. Note that address bits 3-11 form the 9-bit receiver tag. Software 
must initialize these bits in the segment register to the ID of the BUC to be 
addressed; they are referred to as the BUID (bus unit ID) bits. 

• PID (sender tag): Address bits 28-31 form the 4-bit sender tag. The 603 PID 
(processor ID) comes from bits 28-31 of the 603's processor ID register. The 4-bit 
PID tag allows a maximum of 16 processor IDs to be defined for a given system. If 
more bits are needed for a very large multiprocessor system, for example, it is 
envisioned that the second-level cache (or equivalent logic) can append a larger 
processor tag as needed. The BUC addressed by the receiver tag should latch the 
sender address required by the subsequent I/O reply operation. 
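
As a rough illustration of the packet 0 layout, the Python sketch below assembles the 32-bit address value from the fields listed above. The function names are hypothetical, and the code assumes the manual's bit numbering in which bit 0 is the most-significant bit of the 32-bit address bus.

```python
def packet0_address(key: int, segment_register: int, pid_register: int) -> int:
    """Build the packet 0 address: 0b00 || key || SR[3-27] || PID[28-31]."""
    if key not in (0, 1):
        raise ValueError("key must be 0 or 1")
    sr_bits_3_27 = segment_register & 0x1FFFFFF0  # bits 3-27 (bit 0 = MSB)
    sender_tag = pid_register & 0xF               # bits 28-31 of the PID register
    # Bits 0 and 1 stay zero; the key bit lands in bit 2 (shift of 31 - 2 = 29).
    return (key << 29) | sr_bits_3_27 | sender_tag


def receiver_tag(packet0: int) -> int:
    """Extract the 9-bit BUID (address bits 3-11) that selects the BUC."""
    return (packet0 >> 20) & 0x1FF
```

A BUC model could call `receiver_tag` on each packet 0 it observes and compare the result with its own BUID before latching the sender tag.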

C.1.2.2.2 Packet 1 

The second address beat, packet 1, transfers byte counts and the physical address for the 
transaction, as shown in Figure C-3. 



[Diagram: XATC bits 0-7 carry the byte count; address bus bits A0-A3 carry SR[28-31] 
and bits A4-A31 carry the bus address.] 



Figure C-3. Direct-Store Operation — Packet 1 

For packet 1, the XATC is defined as follows: 

• Load request operations: XATC contains the total number of bytes to be transferred 
(128 bytes maximum for 603). 

• Immediate/last (load or store) operations: XATC contains the current transfer byte 
count (1 to 4 bytes). 



Address bits 0-31 contain the physical address of the transaction. The physical address is 
generated by concatenating segment register bits 28-31 with bits 4-31 of the effective 
address, as follows: 

Segment register (bits 28-31) || effective address (bits 4-31) 
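
The packet 1 fields can be sketched the same way. The names below are hypothetical, bit 0 is again taken to be the most-significant bit, and the byte count is assumed to be carried as a plain binary value in the 8-bit XATC field (the text gives the limits but not the encoding).

```python
def packet1_address(segment_register: int, effective_address: int) -> int:
    """Concatenate SR[28-31] || EA[4-31] into the 32-bit packet 1 address."""
    sr_low = segment_register & 0xF          # SR bits 28-31
    ea_low = effective_address & 0x0FFFFFFF  # EA bits 4-31
    return (sr_low << 28) | ea_low


def packet1_byte_count(byte_count: int, is_load_request: bool) -> int:
    """Validate the byte count carried in the packet 1 XATC field.

    Load requests carry the total count (up to 128 bytes on the 603);
    immediate/last operations carry the current transfer count (1 to 4).
    """
    limit = 128 if is_load_request else 4
    if not 1 <= byte_count <= limit:
        raise ValueError(f"byte count must be between 1 and {limit}")
    return byte_count
```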

While the 603 provides the address of the transaction to the BUC, the BUC must maintain 
a valid address pointer for the reply. 



C.1.2.3 I/O Reply Operations 

BUCs must respond to 603 direct-store transactions with an I/O reply operation, as shown 
in Figure C-4. The purpose of this reply operation is to inform the 603 of the success or 
failure of the attempted direct-store access. This requires the system direct-store slave to 
have 603 bus mastership capability — a substantially more complex design task than bus 
slave implementations that use memory-mapped I/O access. 

Reply operations from the BUC to the 603 are address-only transactions. As with packet 0 
of the address bus on 603 direct-store operations, the XATC contains the opcode for the 
operation (see Table C-1). Additionally, the I/O reply operation transfers the 
sender/receiver tags in the first beat. 

[Diagram: XATC bits 0-7 carry the I/O opcode; on the address bus (A0-A31), bits 0-1 are 
reserved, bit 2 is the error bit, bits 3-11 are the BUID, bits 12-27 are BUC-specific, and 
bits 28-31 are the PID.] 

Figure C-4. I/O Reply Operation 

The address bits are described in Table C-2. 

Table C-2. Address Bits for I/O Reply Operations 

Address Bits   Description
0-1            Reserved. These bits should be cleared for compatibility with future
               PowerPC microprocessors.
2              Error bit. It is set if the BUC records an error in the access.
3-11           BUID. Sender tag of a reply operation. Corresponds with bits 3-11 of one
               of the 603 segment registers.
12-27          Address bits 12-27 are BUC-specific and are ignored by the 603.
28-31          PID (receiver tag). The 603 effectively snoops operations on the bus and,
               on reply operations, compares this field to bits 28-31 of the PID register
               to determine if it should recognize this I/O reply.



The second beat of the address bus is reserved; the XATC and address buses should be 
driven to zero to preserve compatibility with future protocol enhancements. 
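
The first-beat address fields of an I/O reply can be decoded as in the following Python sketch. The class and function names are hypothetical; the field positions are those of Table C-2, with bit 0 as the most-significant address bit.

```python
from typing import NamedTuple


class IOReply(NamedTuple):
    error: bool  # address bit 2
    buid: int    # address bits 3-11 (sender tag of the reply)
    pid: int     # address bits 28-31 (receiver tag)


def decode_io_reply(address: int) -> IOReply:
    """Split the first-beat address of an I/O reply into its fields."""
    return IOReply(
        error=bool((address >> 29) & 1),
        buid=(address >> 20) & 0x1FF,
        pid=address & 0xF,
    )


def reply_is_for(address: int, pid_register: int) -> bool:
    """Compare the reply's receiver tag with bits 28-31 of the PID register."""
    return (address & 0xF) == (pid_register & 0xF)
```

Bits 12-27 are deliberately not decoded, since they are BUC-specific and ignored by the 603.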

The following sequence occurs when the 603 detects an error bit set on an I/O reply 
operation: 

1. The 603 completes the instruction that initiated the access. 

2. If the instruction is a load, the data is forwarded to the register file(s)/sequencer. 

3. A direct-store error exception is generated, which transfers 603 control to the 
direct-store error exception handler to recover from the error. 

If the error bit is not set, the 603 instruction that initiated the access completes and 
instruction execution resumes. 

System designers should note the following: 

• “Misplaced” reply operations (that match the processor tag and arrive unexpectedly) 
are ignored by the 603. 

• External logic must assert AACK for the 603, even though it is the receiver of the 
reply operation. AACK is an input-only signal to the 603. 

• When enabled by software, the 603 monitors address parity on XATS and reply 
operations (load or store). 

C.1.2.4 Direct-Store Operation Timing 

The following timing diagrams show the sequence of events in a typical 603 direct-store 
load access (Figure C-5) and a typical 603 direct-store store access (Figure C-6). All 
arbitration signals except for ABB and DBB have been omitted for clarity, although they 
are still required. Note that, for either case, the number of immediate operations depends 
on the amount and the alignment of data to be transferred. If no more than 4 bytes are being 
transferred, and the data is double-word-aligned (that is, does not straddle an 8-byte address 
boundary), there will be no immediate operation as shown in the figures. 

The 603 can transfer as many as 128 bytes of data in one load or store instruction (requiring 
more than 33 immediate operations in the case of misaligned operands). 

In Figure C-5, XATS is asserted with the same timing relationship as TS in a memory 
access. Notice, however, that the address bus (and XATC) transition on the next bus clock 
cycle. The first of the two beats on the address bus is valid for one bus clock cycle window 
only, and that window is defined by the assertion of XATS. The second address bus beat, 
however, can be extended by delaying the assertion of AACK until the system has latched 
the address. 

The load request and load reply operations, shown in Figure C-5, are address-only 
transactions as denoted by the negated TT3 signal during their respective address tenures. 
Note that other types of bus operations can occur between the individual direct-store 
operations on the bus. The 603 involved in this transaction, however, does not initiate any 
other direct-store load or store operations once the first direct-store operation has begun 
address tenure; however, if the I/O operation is retried, other higher-priority operations can 
occur. 

Notice that, in this example (zero wait states), 13 bus clock cycles are required to transfer 
no more than 8 bytes of data. 

[Timing diagram: request, immediate, last, and reply operations spanning bus clock 
cycles 1 through 13.] 




Figure C-5. Direct-Store Interface Load Access Example 

Figure C-6 shows a direct-store store access composed of three direct-store operations. As 
with the example in Figure C-5, notice that data is transferred only on the 32 bits of the DH 
bus. As opposed to Figure C-5, there is no request operation since the 603 has the data ready 
for the BUC. 



The assertion of the TEA signal during a direct-store operation indicates that an 
unrecoverable error has occurred. If the TEA signal is asserted during a direct-store 
operation, the TEA action is delayed and the following direct-store transactions 
continue until all data transfers from the direct-store segment have been completed. The bus 
agent that asserts TEA is responsible for asserting the TEA signal for every direct-store 
transaction tenure, including the last one. The direct-store reply, in this case, is not required 
and is ignored by the processor. The processor takes a machine check exception 
after the last direct-store data tenure has been terminated by the assertion of TEA, and not 
before. 








[Timing diagram: immediate, last, and reply operations spanning bus clock cycles 1 
through 10.] 




Figure C-6. Direct-Store Interface Store Access Example 



C.1.3 CSE Signal 

The 603 employs two-way set associativity for both the instruction and data caches, in 
place of the four-way set associativity of the 603e. The CSE signal indicates which cache 
set is being loaded during a cache line fill. 

Table C-3 shows the CSE signal encoding indicating the cache set selected during a cache 
load operation. 



Table C-3. CSE Signal Encoding 

CSE   Cache Set Element
0     Set 0
1     Set 1



C.1.4 PowerPC 603 Processor Bus Clock Multiplier Configuration 

The 603 provides support for bus clock multipliers of 1:1, 2:1, 3:1, and 4:1. The bus clock 
multipliers are selected through the setting of the PLL_CFG0-PLL_CFG3 signals as 
shown in Table C-4. 

Table C-4. PowerPC 603 Microprocessor PLL Configuration 

           CPU/       Bus, CPU, and PLL Frequencies
PLL_CFG    SYSCLK     Bus          Bus         Bus         Bus          Bus         Bus         Bus
0-3        Ratio      16.6 MHz     20 MHz      25 MHz      33.3 MHz     40 MHz      50 MHz      66.6 MHz
0000       1:1        -            -           -           -            -           -           66.6 (133)
0001       1:1        -            -           -           33.3 (133)   40 (160)    50 (200)    -
0010       1:1        16.6 (133)   20 (160)    25 (200)    -            -           -           -
0100       2:1        -            -           -           66.6 (133)   80 (160)    100 (200)   -
0101       2:1        33.3 (133)   40 (160)    50 (200)    -            -           -           -
1000       3:1        -            -           75 (150)    100 (200)    -           -           -
1001       3:1        50 (200)     -           -           -            -           -           -
1100       4:1        66.6 (133)   80 (160)    100 (200)   -            -           -           -
0011       PLL bypass
1111       Clock off


Notes: 1. Some PLL configurations may select bus, CPU, or PLL frequencies which are not useful, not 
supported, or not tested for by the 603. PLL frequencies (shown in parentheses in Table C-4) 
should not fall below 133 MHz, and should not exceed 200 MHz. 

2. In PLL bypass mode, the SYSCLK input signal clocks the internal processor directly, the PLL is 
disabled, and the bus mode is set for 1:1 mode operation. This mode is intended for factory use 
only. Note that the AC timing specifications given in this document do not apply in PLL bypass 
mode. 

3. In clock-off mode, no clocking occurs inside the 603 regardless of the SYSCLK input. 

4. PLL_CFG0-PLL_CFG1 signals select the CPU-to-bus ratio (1:1, 2:1, 3:1, 4:1), 
PLL_CFG2-PLL_CFG3 signals select the CPU-to-PLL multiplier (x2, x4, x8). 
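
The relationship between the PLL_CFG encoding and the resulting frequencies can be sketched as a small lookup. This is an illustration only: the names are hypothetical, the ratio and multiplier pairs are inferred from the rows of Table C-4, and encodings that do not appear in the table are rejected rather than guessed.

```python
# (CPU-to-bus ratio, CPU-to-PLL multiplier) per PLL_CFG0-PLL_CFG3 encoding,
# inferred from the rows of Table C-4.
PLL_TABLE = {
    0b0000: (1, 2),
    0b0001: (1, 4),
    0b0010: (1, 8),
    0b0100: (2, 2),
    0b0101: (2, 4),
    0b1000: (3, 2),
    0b1001: (3, 4),
    0b1100: (4, 2),
}
SPECIAL_MODES = {0b0011: "PLL bypass", 0b1111: "clock off"}


def pll_frequencies(pll_cfg: int, bus_mhz: float) -> tuple[float, float]:
    """Return (CPU MHz, PLL MHz) for a listed PLL_CFG encoding."""
    if pll_cfg in SPECIAL_MODES:
        raise ValueError(f"{SPECIAL_MODES[pll_cfg]} mode has no PLL frequencies")
    if pll_cfg not in PLL_TABLE:
        raise ValueError("encoding is not listed in Table C-4")
    ratio, multiplier = PLL_TABLE[pll_cfg]
    cpu_mhz = bus_mhz * ratio
    return cpu_mhz, cpu_mhz * multiplier


def pll_in_range(pll_mhz: float) -> bool:
    """Check the 133-200 MHz PLL window from Note 1 (rounded to whole MHz)."""
    return 133 <= round(pll_mhz) <= 200
```

For instance, encoding 1000 with a 25-MHz bus gives a 75-MHz CPU clock and a 150-MHz PLL frequency, which matches the corresponding cell of Table C-4.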

C.1.5 PowerPC 603 Processor Cache Organization 

The 603 provides two 8-Kbyte, two-way set associative caches to allow the registers and 
execution units rapid access to instructions and data. The instruction and data caches are 
configured as 128 sets of two blocks. The operation of the 603's instruction and data caches 
is consistent with the caches in the 603e, with the exception of the reduced cache size and 
set associativity. 

C.1.5.1 Instruction Cache Organization 

The organization of the instruction cache is shown in Figure C-7. Each cache block 
contains eight contiguous words from memory that are loaded from an eight-word 
boundary (that is, bits A27-A31 of the logical (effective) addresses are zero); as a result, 
cache blocks are aligned with page boundaries. 

Note that address bits A20-A26 provide an index to select a set. Bits A27-A31 select a byte 
within a block. The tags consist of bits PA0-PA19. Address translation occurs in parallel, 
such that higher-order bits (the tag bits in the cache) are physical. Note that the replacement 
algorithm is strictly an LRU algorithm; that is, the least recently used block is filled with 
new instructions on a cache miss. 
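
The address split described above can be checked with a short Python sketch. The constant and function names are hypothetical; the field widths follow from the text (bit 0 is the most-significant bit, so A27-A31 are the five low-order bits).

```python
BLOCK_BYTES = 32  # eight words per block, selected by A27-A31
SETS = 128        # indexed by A20-A26
WAYS = 2          # two-way set associative


def split_address(address: int) -> tuple[int, int, int]:
    """Return (tag, set index, byte offset) for a 32-bit address."""
    address &= 0xFFFFFFFF
    offset = address & 0x1F        # A27-A31: byte within the block
    index = (address >> 5) & 0x7F  # A20-A26: one of 128 sets
    tag = address >> 12            # PA0-PA19: physical tag bits
    return tag, index, offset


# 128 sets x 2 blocks x 32 bytes gives the 8-Kbyte capacity of each cache.
CACHE_BYTES = SETS * WAYS * BLOCK_BYTES
```

Because the index and offset together cover only the low-order 12 bits, they lie within the 4-Kbyte page offset, which is why set selection can proceed in parallel with address translation.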
Figure C-7. Instruction Cache Organization 



C.1.5.2 Data Cache Organization 

The organization of the data cache is shown in Figure C-8. Each cache block contains eight 
contiguous words from memory that are loaded from an eight-word boundary (that is, bits 
A27-A31 of the logical (effective) addresses are zero); as a result, cache blocks are aligned 
with page boundaries. 

Note that address bits A20-A26 provide an index to select a set. Bits A27-A31 select a byte 
within a block. The tags consist of bits PA0-PA19. Address translation occurs in parallel, 
such that higher-order bits (the tag bits in the cache) are physical. Note that the replacement 
algorithm is strictly an LRU algorithm; that is, the least recently used block is filled with 
new data on a cache miss. 

Figure C-8. Data Cache Organization 

C.2 PowerPC 603 Processor Software 
Considerations 

When developing software for the 603, the programmer should note the following 
differences from the 603e: 

• The 603 supports direct-store accesses; setting T = 1 in a segment register does not 
result in a DSI exception. 

• Store instructions have two-cycle latency and two-cycle throughput. 

• The 603 does not perform integer add or compare instructions in the SRU. 

• The 603 does not implement the key bit (bit 12) in SRR1 to provide information 
about memory protection violations prior to page table search operations. 

• HID1 is not implemented by the 603; no read-only access to the PLL_CFG signal 
configuration is provided. 

The following sections provide further information on the 603 attributes that may affect 
software written for the 603e. 

C.2.1 Direct-Store Interface Address Translation 

With address translation enabled, all memory accesses generated by the 603 map to a 
segment descriptor in the segment table. If T = 1 for the selected segment descriptor and 
there are no BAT hits, the access maps to the direct-store interface, invoking a specific bus 
protocol for accessing some special-purpose I/O devices. Direct-store segments are 
provided for POWER compatibility. Because the direct-store interface is present only for 
compatibility with existing I/O devices that used this interface, and its protocol is not 
optimized for performance, its use is discouraged. 

The selection of address translation type differs for instruction and data accesses only in 
that instruction accesses are not allowed from direct-store segments; attempting to fetch an 
instruction from a direct-store segment causes an ISI exception. Applications that require 
low-latency load/store access to external address space should use memory-mapped I/O, 
rather than the direct-store interface. Refer to Chapter 5, “Memory Management” for 
additional information about address translation and memory accesses. 

C.2.1.1 Direct-Store Segment Translation Summary Flow 

Figure C-9 shows the flow used by the MMU when direct-store segment address translation 
is selected. In the case of a floating-point load or store operation to a direct-store segment, 
the 603 takes an alignment exception; other implementations may not take an alignment 
exception, as is allowed by the PowerPC architecture. In the case of an eciwx, ecowx, lwarx, 
or stwcx. instruction, the 603 sets the DSISR register as shown and causes the DSI exception. 

[Flow diagram, for direct-store segment translation (T = 1): an instruction access causes an 
ISI exception; otherwise, an eciwx, ecowx, lwarx, or stwcx. instruction sets DSISR[5] and 
causes a DSI exception; otherwise, a floating-point load or store causes an alignment 
exception (optional to the PowerPC architecture; implemented in the 603); otherwise, a 
cache instruction (dcbt, dcbtst, dcbf, dcbi, dcbst, dcbz, or icbi) is treated as a no-op; 
otherwise, the direct-store interface access is performed.] 



Figure C-9. Direct-Store Segment Translation Flow 

A direct-store access occurs when a data access is initiated and SR[T] is set. In the 603, 
MSR[DR] is a don’t care for this case. The following apply for direct-store accesses: 

• Floating-point loads and stores to direct-store segments always cause an alignment 
exception, regardless of operand alignment. 

• lwarx or stwcx. instructions that map into a direct-store segment always cause a DSI 
exception. However, if the instruction crosses a segment boundary, an alignment 
exception is taken instead. 
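
The decision order of the translation flow can be summarized in a short Python sketch. The function name, parameters, and returned strings are hypothetical, and the code models only the cases named in this section for a segment with T = 1.

```python
DSI_INSTRUCTIONS = {"eciwx", "ecowx", "lwarx", "stwcx."}
CACHE_INSTRUCTIONS = {"dcbt", "dcbtst", "dcbf", "dcbi", "dcbst", "dcbz", "icbi"}


def direct_store_outcome(
    mnemonic: str,
    is_instruction_fetch: bool = False,
    is_fp_load_store: bool = False,
    crosses_segment_boundary: bool = False,
) -> str:
    """Classify an access to a T = 1 segment, following the order of Figure C-9."""
    if is_instruction_fetch:
        return "ISI exception"
    if mnemonic in DSI_INSTRUCTIONS:
        # lwarx/stwcx. that cross a segment boundary take an alignment exception.
        if mnemonic in ("lwarx", "stwcx.") and crosses_segment_boundary:
            return "alignment exception"
        return "DSI exception (DSISR[5] set)"
    if is_fp_load_store:
        return "alignment exception"
    if mnemonic in CACHE_INSTRUCTIONS:
        return "no-op"
    return "direct-store interface access"
```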

C.2.1.2 Direct-Store Interface Accesses 

When the address translation process determines that the segment descriptor has T = 1, 
direct-store interface address translation is selected; no reference is made to the page 
tables, and referenced and changed bits are not updated. These accesses are performed as if 
the WIMG bits were 0b0101; that is, caching is inhibited, the accesses bypass the cache, 
hardware-enforced coherency is not required, and the accesses are considered guarded. 
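
As a quick check of the 0b0101 value, the Python sketch below decodes a 4-bit WIMG field into its individual attributes. The function name and dictionary keys are hypothetical labels for the W, I, M, and G bits.

```python
def decode_wimg(wimg: int) -> dict[str, bool]:
    """Decode a 4-bit WIMG value (W is the most-significant bit)."""
    return {
        "write_through": bool(wimg & 0b1000),      # W
        "caching_inhibited": bool(wimg & 0b0100),  # I
        "memory_coherence": bool(wimg & 0b0010),   # M
        "guarded": bool(wimg & 0b0001),            # G
    }
```

Calling `decode_wimg(0b0101)` reports caching inhibited and guarded as set, with write-through and memory coherence clear, in line with the description above.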

The specific protocol invoked to perform these accesses involves the transfer of address and 
data information in packets; however, the PowerPC OEA does not define the exact 
hardware protocol used for direct-store interface accesses. Some instructions cause 
multiple address/data transactions to occur on the bus. In this case, the address for each 
transaction is handled individually with respect to the DMMU. 

The following data is sent by the 603 to the memory controller in the protocol (two packets 
consisting of address-only cycles). 

• Packet 0 

— One of the Kx bits (Ks or Kp) is selected to be the key as follows: 

- For supervisor accesses (MSR[PR] = 0), the Ks bit is used and Kp is ignored. 

- For user accesses (MSR[PR] = 1), the Kp bit is used and Ks is ignored. 

— The contents of bits 3-31 of the segment register, which is the BUID field 
concatenated with the “controller-specific” field. 

• Packet 1: SR[28-31] concatenated with the 28 lower-order bits of the effective 
address, EA4-EA31. 
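
The key selection for packet 0 can be sketched as follows. The bit positions used here are assumptions that this section does not state: they follow the 32-bit PowerPC architecture definitions, in which MSR[PR] is MSR bit 17 and, in a segment register with T = 1, Ks is bit 1 and Kp is bit 2 (bit 0 being the most-significant bit). The constant and function names are hypothetical.

```python
MSR_PR = 1 << (31 - 17)  # MSR[PR]: 1 = user (problem) state; assumed bit 17
SR_KS = 1 << (31 - 1)    # supervisor-state key; assumed SR bit 1
SR_KP = 1 << (31 - 2)    # user-state key; assumed SR bit 2


def select_key(msr: int, segment_register: int) -> int:
    """Pick Kp for user accesses (MSR[PR] = 1) and Ks for supervisor accesses."""
    mask = SR_KP if msr & MSR_PR else SR_KS
    return 1 if segment_register & mask else 0
```

The selected bit is the one that would be driven as the key in packet 0; the other key bit is ignored for that access.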

C.2.1.3 Direct-Store Segment Protection 

Page-level memory protection as described in Section 5.4.2, “Page Memory Protection,” is 
not provided for direct-store segments. The appropriate key bit (Ks or Kp) from the 
segment descriptor is sent to the memory controller, and the memory controller implements 
any protection required. Frequently, no such mechanism is provided; the fact that a direct- 
store segment is mapped into the address space of a process may be regarded as sufficient 
authority to access the segment. 

C.2.1.4 Instructions Not Supported in Direct-Store Segments 

The following instructions are not supported at all and cause a DSI exception in the 603 
(with DSISR[5] set) when issued with an effective address that selects a segment descriptor 
that has T = 1 (or when MSR[DR] = 0): 

• lwarx 

• stwcx. 

• eciwx 

• ecowx 

C.2.1.5 Instructions with No Effect in Direct-Store Segments 

The following instructions are executed as no-ops by the 603 when issued with an effective 
address that selects a segment where T = 1 : 

• dcbt

• dcbtst

• dcbf

• dcbi

• dcbst

• dcbz

• icbi
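
The behavior described in Sections C.2.1.4 and C.2.1.5 can be summarized as a small lookup, sketched below in Python. The function name and return values are hypothetical, and the sketch models only the T = 1 case; the MSR[DR] = 0 clause of Section C.2.1.4 is deliberately left out. DSISR[5] is expressed with PowerPC bit numbering, in which bit 0 is the most significant bit.

```python
DSISR_BIT5 = 1 << (31 - 5)  # bit 5 counted from the most significant bit

# Instructions that cause a DSI exception in a direct-store segment.
UNSUPPORTED = {"lwarx", "stwcx.", "eciwx", "ecowx"}
# Instructions that execute as no-ops in a direct-store segment.
NO_OPS = {"dcbt", "dcbtst", "dcbf", "dcbi", "dcbst", "dcbz", "icbi"}


def direct_store_outcome(mnemonic: str, t_bit: int):
    """Return (action, dsisr) for an instruction whose EA selects a segment
    with the given T bit. Only the two lists above are modeled; every other
    instruction is reported as 'normal'."""
    if t_bit == 1 and mnemonic in UNSUPPORTED:
        return ("dsi", DSISR_BIT5)
    if t_bit == 1 and mnemonic in NO_OPS:
        return ("no-op", 0)
    return ("normal", 0)


print(direct_store_outcome("lwarx", 1))
print(direct_store_outcome("dcbz", 1))
```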

C.2.2 Store Instruction Latency 

Store instructions on the 603 execute with 2-cycle latency and 2-cycle throughput,
in contrast to the 2-cycle latency and 1-cycle throughput of the 603e. Table C-5
provides the latencies for the store instructions executed by the 603.



Table C-5. Store Instruction Timing

Primary   Extended   Mnemonic   Unit   Cycles
31        151        stwx       LSU    2:2
31        183        stwux      LSU    2:2
31        215        stbx       LSU    2:2
31        247        stbux      LSU    2:2
31        407        sthx       LSU    2:2
31        438        ecowx      LSU    2:2
31        439        sthux      LSU    2:2
31        662        stwbrx     LSU    2:2
31        663        stfsx      LSU    2:2
31        695        stfsux     LSU    2:2
31        727        stfdx      LSU    2:2
31        918        sthbrx     LSU    2:2
31        983        stfiwx     LSU    2:2
36        -          stw        LSU    2:2
37        -          stwu       LSU    2:2
38        -          stb        LSU    2:2
39        -          stbu       LSU    2:2
44        -          sth        LSU    2:2
45        -          sthu       LSU    2:2
52        -          stfs       LSU    2:2
53        -          stfsu      LSU    2:2
54        -          stfd       LSU    2:2
55        -          stfdu      LSU    2:2
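
To illustrate what the difference in store throughput means for a run of stores, the following Python sketch applies a simple pipeline estimate: the first store takes the full latency, and each later store adds one throughput interval. This is an idealized model under the assumption of independent, back-to-back stores with no cache misses or other stalls; the function name is hypothetical.

```python
def store_run_cycles(n_stores: int, latency: int, throughput: int) -> int:
    """Idealized cycle count for n back-to-back, independent stores."""
    if n_stores <= 0:
        return 0
    return latency + (n_stores - 1) * throughput


# 603: 2-cycle latency, 2-cycle throughput (the 2:2 entries in Table C-5).
cycles_603 = store_run_cycles(4, latency=2, throughput=2)
# 603e: 2-cycle latency, 1-cycle throughput.
cycles_603e = store_run_cycles(4, latency=2, throughput=1)
print(cycles_603, cycles_603e)
```

Under this model, four consecutive stores take 8 cycles on the 603 and 5 cycles on the 603e.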
C.2.3 Instruction Execution by System Register Unit

Unlike the 603e, the 603's SRU does not execute integer add and compare instructions.
Table C-6 lists the instructions executed by the 603's SRU and the number of cycles
required for execution.



Table C-6. System Register Instructions

Primary   Extended   Mnemonic             Unit   Cycles
17        --1        sc                   SRU    3
19        050        rfi                  SRU    3
19        150        isync                SRU    1&
31        083        mfmsr                SRU    1
31        146        mtmsr                SRU    2
31        210        mtsr                 SRU    2
31        242        mtsrin               SRU    2
31        339        mfspr (not I/DBATs)  SRU    1
31        339        mfspr (DBATs)        SRU    3&
31        339        mfspr (IBATs)        SRU    3&
31        467        mtspr (not IBATs)    SRU    2 (XER-&)
31        467        mtspr (IBATs)        SRU    2&
31        595        mfsr                 SRU    3&
31        598        sync                 SRU    1&
31        659        mfsrin               SRU    3&
31        854        eieio                SRU    1
31        371        mftb                 SRU    1
31        467        mttb                 SRU    1



Note: Cycle times marked with “&” require a variable number of cycles due to 
serialization. 
Glossary of Terms and Abbreviations

The glossary contains an alphabetical list of terms, phrases, and abbreviations used in this 
book. Some of the terms and definitions included in the glossary are reprinted from IEEE 
Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic, copyright ©1985 by 
the Institute of Electrical and Electronics Engineers, Inc. with the permission of the IEEE. 



A Atomic. A bus access that attempts to be part of a read-write operation to the
same address uninterrupted by any other access to that address (the
term refers to the fact that the transactions are indivisible). The
PowerPC 603e microprocessor initiates the read and write
separately, but signals the memory system that it is attempting an
atomic operation. If the operation fails, status is kept so that the 603e
can try again. The 603e implements atomic accesses through the
lwarx/stwcx. instruction pair.



B 



Beat. A single state on the 603e bus interface that may extend across multiple 
bus cycles. A 603e transaction can be composed of multiple address 
or data beats. 

Biased exponent. The sum of the exponent and a constant (bias) chosen to 
make the biased exponent's range non-negative. 
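
As a brief illustration of this definition, the Python sketch below extracts the biased exponent from an IEEE 754 single-precision value and removes the bias. The bias of 127 is the single-precision value from IEEE 754; the function name is hypothetical, and the result is meaningful only for normalized numbers.

```python
import struct


def single_exponents(x: float):
    """Return (biased, unbiased) exponent of x encoded as IEEE 754 single."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    biased = (bits >> 23) & 0xFF  # 8-bit exponent field
    return biased, biased - 127   # subtract the single-precision bias


print(single_exponents(1.0))
print(single_exponents(8.0))
```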

Big-endian. A byte-ordering method in memory where the address n of a 
word corresponds to the most significant byte. In an addressed 
memory word, the bytes are ordered (left to right) 0, 1, 2, 3, with 0 
being the most significant byte. 
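
The two byte orderings defined in this glossary can be compared directly with Python's standard struct module. This is a small sketch for illustration; the sample word is arbitrary.

```python
import struct

word = 0x0A0B0C0D

# Big-endian: the byte at the lowest address (index 0) is the most significant.
big = struct.pack(">I", word)
# Little-endian: the byte at the lowest address is the least significant.
little = struct.pack("<I", word)

print(big.hex(), little.hex())
```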

Boundedly undefined. The results of attempting to execute a given 
instruction are said to be boundedly undefined if they could have
been achieved by executing an arbitrary sequence of defined 
instructions, in valid form, starting in the state the machine was in 
before attempting to execute the given instruction. Boundedly 
undefined results for a given instruction may vary between 
implementations, and between execution attempts in the same 
implementation. 
Branch folding. A technique of removing the branch instruction from the
instruction sequence. 

Burst. A multiple beat data transfer whose total size is typically equal to a 
cache block (in the 603e, a 32-byte block). 

Bus clock. Clock that causes the bus state transitions. 

Bus master. The owner of the address or data bus; the device that initiates or 
requests the transaction. 



C Cache. High-speed memory containing recently accessed data and/or 

instructions (subset of main memory). 

Cache block. The cacheable unit for a PowerPC processor. The size of a 
cache block may vary among processors. For the 603e, it is one 
cache line (8 words). 

Cache coherency. Caches are coherent if a processor performing a read from 
its cache is supplied with data corresponding to the most recent value 
written to memory or to another processor’s cache. 

Cast-outs. Cache block that must be written to memory when a cache miss
causes the least recently used block with modified data to be 
replaced. 

Context synchronization. Context synchronization is the result of specific 
instructions (such as sc or rfi) or when certain events occur (such as 
an exception). During context synchronization, all instructions in 
execution complete past the point where they can produce an 
exception; all instructions in execution complete in the context in 
which they began execution; all subsequent instructions are fetched 
and executed in the new context. 

Copy-back operation. A cache operation in which a cache line is copied 
back to memory to enforce cache coherency. Copy-back operations 
consist of snoop push-out operations and cache cast-out operations. 



D Denormalized number. A nonzero floating-point number whose exponent 

has a reserved value, usually the format's minimum, and whose 
explicit or implicit leading significand bit is zero. 
Direct-store segment access. An access to an I/O address space. The 603
defines separate memory-mapped and I/O address spaces, or 
segments, distinguished by the corresponding segment register T bit 
in the address translation logic of the 603. If the T bit is cleared, the 
memory reference is a normal memory-mapped access and can use 
the virtual memory management hardware of the 603. If the T bit is 
set, the memory reference is a direct-store access. 



E Exception. An unusual or error condition encountered by the processor that
results in special processing.

Exception handler. A software routine that executes when an exception 
occurs. Normally, the exception handler corrects the condition that 
caused the exception, or performs some other meaningful task (such 
as aborting the program that caused the exception). The addresses of 
the exception handlers are defined by a two-word exception vector 
that is branched to automatically when an exception occurs. 

Exclusive state. MEI state (E) in which only one caching device contains
data that is also in system memory. 

Execution synchronization. All instructions in execution are architecturally 
complete before beginning execution (appearing to begin execution) 
of the next instruction. Similar to context synchronization but doesn't 
force the contents of the instruction buffers to be deleted and 
refetched. 

Exponent. The component of a binary floating-point number that normally 
signifies the integer power to which two is raised in determining the 
value of the represented number. Occasionally the exponent is called 
the signed or unbiased exponent. 



F 



Feed-forwarding. A 603e feature that reduces the number of clock cycles 
that an execution unit must wait to use a register. When the source 
register of the current instruction is the same as the destination 
register of the previous instruction, the result of the previous 
instruction is routed to the current instruction at the same time that it 
is written to the register file. With feed-forwarding, the destination 
bus is gated to the waiting execution unit over the appropriate source 
bus, saving the cycles which would be used for the write and read. 

Floating-point unit. The functional unit in the 603e processor responsible 
for executing all floating-point instructions. 
Flush. An operation that causes a modified cache block to be invalidated and
the data to be written to memory. 

Fraction. The field of the significand that lies to the right of its implied binary 
point. 



G General-purpose register. Any of the 32 registers in the 603e register file. 

These registers provide the source operands and destination results 
for all 603e data manipulation instructions. Load instructions move 
data from memory to registers, and store instructions move data from 
registers to memory. 



IEEE 754. A standard written by the Institute of Electrical and Electronics 
Engineers that defines operations of binary floating-point arithmetic 
and representations of binary floating-point numbers. 

Instruction queue. A holding place for instructions fetched from the current 
instruction stream. 

Integer unit. The functional unit in the 603e responsible for executing all 
integer instructions. 

Interrupt. An external signal that causes the 603e to suspend current 
execution and take a predefined exception. 

Invalid state. MEI state (I) that indicates that the cache block does not
contain valid data. 



K Kill. An operation that causes a cache block to be invalidated. 



L Latency. The number of clock cycles necessary to execute an instruction and 

make ready the results of that instruction. 

Little-endian. A byte-ordering method in memory where the address n of a 
word corresponds to the least significant byte. In an addressed 
memory word, the bytes are ordered (left to right) 3, 2, 1, 0, with 3 
being the most significant byte. 

Livelock. A state in which processors interact in a way such that no processor 
makes progress. 



M Mantissa. The decimal part of a logarithm.
Memory-mapped accesses. Accesses whose addresses use the segmented or
block address translation mechanisms provided by the MMU and 
that occur externally with the bus protocol defined for memory. 

Memory coherency. Refers to memory agreement between caches and 
system memory (for example, MEI cache coherency).

Memory consistency. Refers to levels of memory with respect to a single 
processor and system memory (for example, on-chip cache, 
secondary cache, and system memory). 

Memory-forced I/O controller interface access. These accesses are made 
to memory space. They do not use the extensions to the memory 
protocol described for I/O controller interface accesses, and they 
bypass the page- and block-translation and protection mechanisms. 

Memory management unit. The functional unit in the 603e that translates 
the logical address bits to physical address bits. 

Modified state. MEI state (M) in which one, and only one, caching device
has the valid data for that address. The data at this address in external 
memory is not valid. 



N NaN. An abbreviation for not a number; a symbolic entity encoded in 

floating-point format. There are two types of NaNs — signaling NaNs 
and quiet NaNs. 

No-op. No-operation. A single-cycle operation that does not affect registers 
or generate bus activity. 



O



Out-of-order. An operation is said to be out-of-order when it is not 
guaranteed to be required by the sequential execution model, such as 
the execution of an instruction that follows another instruction that 
may alter the instruction flow. For example, execution of instructions 
in an unresolved branch is said to be out-of-order, as is the execution 
of an instruction behind another instruction that may yet cause an 
exception. The results of operations that are performed out-of-order 
are not committed to architected resources until it can be ensured that 
these results adhere to the in-order, or sequential execution model. 

Overflow. An error condition that occurs during arithmetic operations when 
the result cannot be stored accurately in the destination register(s). 
For example, if two 32-bit numbers are added, the sum may require 
33 bits due to carry. Since the 32-bit registers of the 603e cannot 
represent this sum, an overflow condition occurs. 
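
The 33-bit case in this example can be shown with a short Python sketch. It treats both operands as unsigned 32-bit values and reports the bit that does not fit in the register; the function name is hypothetical, and the sketch does not model how the processor records the condition.

```python
def add32(a: int, b: int):
    """Add two unsigned 32-bit values.

    Returns (result, carry_out), where result is the low 32 bits that fit
    in a register and carry_out is the 33rd bit of the true sum.
    """
    full = (a & 0xFFFFFFFF) + (b & 0xFFFFFFFF)
    return full & 0xFFFFFFFF, full >> 32


print(add32(0xFFFFFFFF, 1))
print(add32(1, 2))
```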
P Packet. A term used in the 603 with respect to direct-store operations.

Page. A 4-Kbyte area of memory, aligned on a 4-Kbyte boundary. 

Park. The act of allowing a bus master to maintain mastership of the bus 
without having to arbitrate. 

Pipelining. A technique that breaks instruction execution into distinct steps 
so that multiple steps can be performed at the same time. 

Precise exceptions. The pipeline can be stopped so the instructions that 
preceded the faulting instruction can complete, and subsequent 
instructions can be executed following the execution of the 
exception handler. The system is precise unless one of the imprecise 
modes for invoking the floating-point enabled exception is in effect. 



Quiesce. To come to rest. The processor is said to quiesce when an exception 
is taken or a sync instruction is executed. The instruction stream is 
stopped at the decode stage and executing instructions are allowed to 
complete to create a controlled context for instructions that may be 
affected by out-of-order, parallel execution. See Context 
synchronization. 

Quiet NaNs. Propagate through almost every arithmetic operation without 
signaling exceptions. These are used to represent the results of 
certain invalid operations, such as invalid arithmetic operations on 
infinities or on NaNs, when invalid. 



S


Scan interface. The 603e’s test interface. 

Shadowing. Shadowing allows a register to be updated by instructions that 
are executed out of order without destroying machine state 
information. 

Signaling NaNs. Signal the invalid operation exception when they are 
specified as arithmetic operands.

Significand. The component of a binary floating-point number that consists 
of an explicit or implicit leading bit to the left of its implied binary 
point and a fraction field to the right. 

Slave. The device addressed by a master device. The slave is identified in the 
address tenure and is responsible for supplying or latching the 
requested data for the master during the data tenure. 
Snooping. Monitoring addresses driven by a bus master to detect the need for
coherency actions. 

Snoop push. Write-backs due to a snoop hit. The block will transition to an 
invalid or exclusive state. 

Split-transaction. A transaction with independent request and response 
tenures. 

Split-transaction Bus. A bus that allows address and data transactions from 
different processors to occur independently. 

Static branch prediction. Mechanism by which software (for example, 
compilers) can give a hint to the machine hardware about the 
direction the branch is likely to take. 

Superscalar machine. A machine that can issue multiple instructions 
concurrently from a conventional linear instruction stream. 

Supervisor mode. The privileged operation state of the 603e. In supervisor 
mode, software can access all control registers and can access the 
supervisor memory space, among other privileged operations. 



T Tenure. The period of bus mastership. For the 603e, there can be separate
address bus tenures and data bus tenures. A tenure consists of three
phases: arbitration, transfer, and termination.

Transaction. A complete exchange between two bus devices. A transaction 
is minimally comprised of an address tenure; one or more data 
tenures may be involved in the exchange. There are two kinds of 
transactions: address/data and address-only. 

Transfer termination. Signal that refers to both signals that acknowledge 
the transfer of individual beats (of both single-beat transfer and 
individual beats of a burst transfer) and to signals that mark the end 
of the tenure. 



U



Underflow. An error condition that occurs during arithmetic operations when 
the result cannot be represented accurately in the destination register. 
For example, underflow can happen if two floating-point fractions 
are multiplied and the result is a single-precision number. The result 
may require a larger exponent and/or mantissa than the single- 
precision format makes available. In other words, the result is too 
small to be represented accurately. 
User mode. The unprivileged operating state of the 603e. In user mode,
software can only access certain control registers and can only 
access user memory space. No privileged operations can be 
performed. 



W Write-through. A memory update policy in which all processor write cycles
are written to both the cache and memory.








INDEX 



Numerics 

603-specific features, 1-4, C-1 

A 

A0-A31 signals, 7-6 
AACK signal, 7-14
ABB signal, 7-5, 8-7 
add, 2-21 
addc, 2-21 
adde, 2-21 
addi, 2-21 
addic, 2-21 
addic., 2-21 
addis, 2-21 
addme, 2-21 
Address bus 

address tenure, 8-6, C-4 
address transfer 
A0-A31, 7-6 
AP0-AP3, 7-7
APE, 7-8, 8-13
address transfer attribute
CI, 7-13

CSE0-CSE1 signals, 7-14
GBL, 7-13
TBST, 7-12, 8-13
TC0-TC1, 7-13, 8-20
TSIZ0-TSIZ2, 7-11, 8-13
TT0-TT4, 7-8, 8-13
WT, 7-13 

address transfer start 

TS, 7-6, 8-12 

XATS (603-specific), 1-4, C-2, C-3 
address transfer termination 
AACK, 7-14 
ARTRY, 3-20, 7-15 
terminating address transfer, 8-20 
arbitration signals, 7-4, 8-7 
bus arbitration
ABB, 7-5, 8-7 
BG, 7-4, 8-7 
BR, 7-4, 8-7 
bus parking, 8-11 
Address calculation 

branch instructions, 2-34 
effective address, 2-18 
floating-point load and store, 2-33 
integer load and store, 2-28 
Address translation, see Memory management unit 
Addressing conventions 
addressing modes, 2-17 
alignment, 2-12 
addze, 2-21 



Aligned data transfer, 2-12, 8-15, 8-17, 8-19 
Alignment 

data transfers, 2-12, 8-15, 8-17, 8-19 
exception, 4-26, 5-16 
rules, 2-12 
and, 2-23 
andc, 2-23 
andi., 2-22 
andis., 2-22 
AP0-AP3 signals, 7-7
APE signal, 7-8, 8-13 
Arbitration, system bus, 8-9, 8-22 
ARTRY signal, 3-20, 7-15 
Atomic memory references 
stwcx., 2-37 

using lwarx/stwcx., 3-19

B 

b, 2-35 

bc, 2-35

bcctr, 2-35

bclr, 2-35

BG signal, 7-4, 8-7 

Block address translation 

BAT register initialization, 5-20 
block address translation flow, 5-11 
selection of block address translation, 5-8 
Boundedly undefined, definition, 2-16 
BR signal, 7-4, 8-7 
Branch folding, 6-13 
Branch instructions 

address calculation, 2-34 
branch instructions, 2-34, A-24 
condition register logical, 2-35, A-24 
system linkage, 2-41, A-24 
trap, 2-36, A-25 
Branch prediction, 6-1, 6-14
Branch processing unit 

branch instruction timing, 6-16 
execution timing, 6-12 
latency, branch instructions, 6-21 
overview, 1-9 
Branch resolution, 6-1 
Burst data transfers 
32-bit data bus, 8-15 
64-bit data bus, 8-14 
transfers with data delays, timing, 8-36 
Burst transactions, 3-8 
Bus arbitration, see Data bus 
Bus configurations, 8-38, 8-40 
Bus interface unit (BIU), 3-2 
Byte ordering 
default, 2-18 

Byte-reverse instructions, 2-30, A-22 
Cache

characteristics, 3-1 
MEI state definition, 3-15 
organization, data, 3-4 
organization, instruction, 3-3 
Cache arbitration, 6-7 
Cache block push operation, 3-7 
Cache block, definition, 3-1 
Cache cast-out operation, 3-7 
Cache coherency 

actions on load operations, 3-18 
actions on store operations, 3-18 
cache control instructions, 3-22 
copy-back operation, 3-11 
in single-processor systems, 3-18 
MEI protocol, 3-14 
overview, 3-2 

reaction to bus operations, 3-19 
WIMG bits, 3-9 
write-back mode, 3-11 
Cache control instructions 
bus operations, 3-25 
dcbf, 2-40, 3-24 
dcbi, 2-43, 3-22 
dcbst, 2-40, 3-24 
dcbt, 2-40, 3-23
dcbtst, 2-40, 3-23
dcbz, 2-40, 3-23
eieio, 2-39, 3-24
icbi, 2-40, 3-25
isync, 2-39, 3-25 
purpose, 3-22 
Cache hit, 6-7 

Cache management instructions, 2-40, A-25 
Cache miss, 6-8 
Cache operations 

basic data cache operations, 3-7 
data cache transactions, 3-8 
instruction cache fill operations, 3-3 
overview, 1-13, 3-1 
response to bus transactions, 3-19 
Cache unit 

memory performance, 6-17 
operation of the cache, 8-2 
overview, 3-1 

Cache-inhibited accesses (I bit) 
cache interactions, 3-9 
I-bit setting, 3-11 
timing considerations, 6-18 
Changed (C) bit maintenance 
recording, 5-11, 5-21-5-24 



Checkstop 

signal, 7-23, 8-41

state, 4-22 

CI signal, 7-13
Classes of instructions, 2-15 
Clean block operation, 3-19 
Clock signals 

CLK_OUT, 7-29
PLL_CFG0-PLL_CFG3, 7-29 
SYSCLK, 7-28 
cmp, 2-22
cmpi, 2-22
cmpl, 2-22
cmpli, 2-22
cntlzw, 2-23 

Compare instructions, 2-26, A-18

Completion considerations, 6-9 

Completion, definition, 6-1 

Context synchronization, 2-19 

COP/scan interface, 7-27 

Copy-back mode, 6-18 

CR logical instructions, 2-35 

crand, 2-35 

crandc, 2-35 

creqv, 2-35 

crnand, 2-35

crnor, 2-35

cror, 2-35 

crorc, 2-36 

crxor, 2-35 

CSE0-CSE1 signals, 7-14, 8-30

D 

Data bus 

32-bit data bus mode, 8-38 
arbitration signals, 7-16, 8-8 
bus arbitration, 8-22 
data tenure, 8-7, C-4 
data transfer, 7-18, 8-24 
data transfer termination, 7-20, 8-25 
Data cache 

basic operations, 3-7 
cache control, 3-5 
configuration, 3-1 
DCFI, DCE, DLOCK bits, 3-5 
organization, 3-5, C-15 
touch load operations, 3-6 
Data cache fill, 3-7 

Data storage interrupt (DSI) see DSI exception 
Data TLB miss on load exception, 4-34 
Data TLB miss on store exception, 4-35 
Data transfers

alignment, 2-12, 8-15, 8-17, 8-19 
burst ordering, 8-14 

eciwx and ecowx instructions, alignment, 8-19 

signals, 8-24 

DBB signal, 7-17, 8-8, 8-23 

DBDIS signal, 7-20

DBG signal, 7-16, 8-8 

DBWO signal, 7-17, 8-8, 8-24, 8-43 

dcbf, 2-40, 3-24 

dcbi, 2-43 

dcbst, 2-40, 3-24 

dcbt, 2-40, 3-23

dcbtst, 2-40, 3-23

dcbz, 2-40, 3-23

DCMP and ICMP registers, 2-9, 5-37 
Decrementer interrupt, 4-31, 9-1 
Defined instruction class, 2-16 
DH0-DH31/DL0-DL31 signals, 7-18
Direct address translation (translation disabled) 
data accesses, 3-10, 5-9, 5-11, 5-20 
instruction accesses, 3-10, 5-9, 5-11, 5-20 
Direct-store interface (603-specific) 
accesses, C-17 
alignment exception, C-17 
architectural ramifications of accesses, C-2 
bus protocol 

address and data tenures, C-4 
detailed description, C-7 
load access, timing, C-11 
load operations, C-6 
store access, timing, C-12 
store operations, C-5 
transactions, C-4 
XATS, C-3 

instructions with no effect, C-18 
no-op instructions, C-18 
protection, C-18 
segment protection, C-18 
selection of direct-store segments, C-16 
unsupported functions, C-18 
Dispatch considerations, 6-9 
divw, 2-21 
divwu, 2-21 

DMISS and IMISS registers, 2-9, 5-36 
DP0-DP7 signals, 7-19 
DPE signal, 7-20 
DRTRY signal, 7-21, 8-25, 8-28 
DSI exception, 4-23 



E 

eciwx, 2-40 
ecowx, 2-40 

Effective address calculation 
address translation, 5-3 
branches, 2-19, 2-34 
loads and stores, 2-18, 2-28, 2-33 
eieio, 2-39, 3-24 
EMI protocol 

enforcing memory coherency, 8-30 
eqv, 2-23 

Error termination, 8-29 
Exceptions 

alignment exception, 4-26 
data TLB miss on load, 4-34 
data TLB miss on store, 4-35 
decrementer interrupt, 4-31 
DSI exception, 4-23 
enabling and disabling, 4-14 
exception classifications, 4-2 
exception processing, 4-10, 4-15 
external interrupt, 4-25 
FP unavailable exception, 4-31 
instruction address breakpoint, 4-35, 4-36 
instruction related, 2-19 
instruction TLB miss, 4-33 
machine check exception, 4-21 
program exception, 4-29 
register settings 
FPSCR, 4-30 
MSR, 4-17 
SRR0/SRR1, 4-11
reset, 4-18 

returning from an exception handler, 4-16 
summary, 2-19 
system call, 4-31 

system management interrupt, 4-37, 4-37 
trace exception, 4-32 
Execution synchronization, 2-19 
Execution units, 1-9 

External control instructions, 2-40, 8-19, A-26 
extsb, 2-23 
extsh, 2-23 



fabs, 2-27 
fadd, 2-25 
fadds, 2-25 
fcmpo, 2-26
fcmpu, 2-26
fctiw, 2-26
fctiwz, 2-26
fdiv, 2-25 
fdivs, 2-25

Features, 603e, 1-2, 1-16 
Feed forwarding, 6-5 
Finish cycle, definition, 6-1 
Floating-point model 
FE0/FE1 bits, 4-14
FP arithmetic instructions, 2-25, A-19
FP compare instructions, 2-26, A-20
FP execution models, 2-12
FP load instructions, 2-33, A-23
FP move instructions, 2-27, A-24
FP multiply-add instructions, 2-25, A-19
FP rounding/conversion instructions, 2-26, A-20 
FP store instructions, 2-33, A-23 
FP unavailable exception, 4-31 
FPSCR instructions, 2-26, A-20 
fsel instruction, 2-25 
Floating-point unit 

execution timing, 6-16 
latency, FP instructions, 6-25 
overview, 1-10 
Flow control instructions 

branch instruction address calculation, 2-34 
branch instructions, 2-35 
condition register logical, 2-35 
Flush block operation, 3-19 
fmadd, 2-25 
fmadds, 2-25 
fmr, 2-27 
fmsub, 2-25 
fmsubs, 2-25 
fmul, 2-25 
fmuls, 2-25 
fnabs, 2-27 
fneg, 2-27 
fnmadd, 2-25 
fnmadds, 2-25 
fnmsub, 2-25 
fnmsubs, 2-25 
FPR0-FPR31, 2-4 
FPSCR instructions, 2-26 
fres, 2-25 
frsp, 2-26 
frsqrte, 2-25 
fsel, 2-25 
fsub, 2-25 
fsubs, 2-25 

G 

GBL signal, 7-13 
GPR0-GPR31, 2-4 
Guarded memory bit (G bit) 
cache interactions, 3-9 
G-bit setting, 3-12 



H 

HASH1 and HASH2 registers, 2-10, 5-37
Hashing functions 

primary PTEG, 5-32 
secondary PTEG, 5-33 
HID0 register

bit settings, 2-7 

DCFI, DCE, DLOCK bits, 3-5 

doze bit, 9-3 

doze, nap, sleep, DPM bits, 2-7 
DPM enable bit, 9-2 
ICFI, ICE, ILOCK bits, 3-4 
nap bit, 9-3 
HID1 register

bit settings, 2-8 
PLL configuration, 2-8, 7-29 
HRESET signal, 7-24 

I 

I/O tenures, C-4 

IABR (instruction address breakpoint register), 2-11

icbi, 2-40, 3-25 

ICE control bit, 3-4 

ICFI control bit, 3-4 

IEEE 1149.1-compliant interface, 8-43

Illegal instruction class, 2-16 

ILOCK control bit, 3-4 

Instruction address breakpoint exception, 4-35 
Instruction cache 

cache control bits, 3-4 
cache fill operations, 3-3 
configuration, 3-1 
ICFI, ICE, ILOCK bits, 3-4 
organization, 3-3, C-14 
Instruction timing 

execution unit, 6-12 
fetch, 6-7 

instruction flow, 6-5 

memory performance considerations, 6-17 
overview, 6-3 
terminology, 6-1 
timing considerations, 6-4 
Instruction TLB miss exception, 4-33 
Instruction unit, 1-7 
Instructions 

603e-specific instructions, 2-45 
branch address calculation, 2-34 
branch instructions, 2-34, A-24 
cache management instructions, 3-22, A-25 
classes, 2-15 

condition register logical, 2-35, A-24 
defined instructions, 2-16 
external control, 2-40, A-26 
floating-point

arithmetic, 2-25, A-19
compare, 2-26, A-20
FP load instructions, 2-33, A-23
FP move instructions, 2-27, A-24
FP status and control register, 2-26
FP store instructions, 2-33, A-23
FPSCR instructions, A-20
multiply-add, 2-25, A-19
rounding and conversion, 2-26, A-20 
icbi, 2-40, 3-25 
illegal instructions, 2-16 
instructions not implemented in 603e, B-1 
integer 

arithmetic, 2-21, A-17
compare, 2-22, A-18
load, A-21 
logical, 2-22, A-18 
multiple, 2-31, A-22 
rotate and shift, 2-23, A-18-A-19 
store, 2-29, 2-29, A-21 
isync, 2-39, 3-25, 4-16 
latency summary, 6-21 
load and store 

address generation, floating-point, 2-33 
address generation, integer, 2-28 
byte-reverse instructions, 2-30, A-22 
integer load, 2-28 

integer multiple instructions, 2-31, A-22 
integer store, 2-29 
string instructions, 2-32, A-22 
memory control, 2-39, 2-42, A-25, A-26 
memory synchronization, 2-37, 2-39, A-22 
PowerPC instructions, list 
form (format), A-27 
function, A-17
legend, A-38 
mnemonic, A-1 
opcode, A-9 

processor control, 2-36, 2-38, 2-41, A-25 
reserved instructions, 2-17 
rfi, 2-41, 4-16

segment register manipulation, 2-43, A-25 

simplified mnemonics, 2-45 

stwcx., 2-37, 2-38, 4-16 

supervisor-level cache management, 2-43 

support for lwarx/stwcx., 8-42

sync, 2-38, 4-16 

system linkage, 2-41, A-24 

TLB management instructions, 2-43, A-26 

tlbld, 2-44, 2-46 

tlbli, 2-44, 2-47 

trap instructions, 2-36, A-25 



INT signal, 7-22, 8-41 

Integer arithmetic instructions, 2-21, A-17

Integer compare instructions, 2-22, A-18 

Integer load instructions, 2-28, A-21 

Integer logical instructions, 2-22, A-18 

Integer multiple instructions, 2-31, A-22 

Integer rotate and shift instructions, 2-23, A-18-A-19 

Integer store instructions, 2-29, A-21 

Integer unit 

execution timing, 6-16 
latency, integer instructions, 6-23 
overview, 1-9 
Interrupt see Exceptions 
Interrupt, external, 4-25 
isync, 2-39, 3-25, 4-16 

K 

Kill block operation, 3-19 

L 

Latency, 6-1, 6-3, 6-21, 8-25 

lbz, 2-29
lbzu, 2-29
lbzux, 2-29
lbzx, 2-29
lfd, 2-33
lfdu, 2-33
lfdux, 2-33
lfdx, 2-33
lfs, 2-33
lfsu, 2-33
lfsux, 2-33
lfsx, 2-33
lha, 2-29
lhau, 2-29
lhaux, 2-29
lhax, 2-29
lhbrx, 2-30
lhz, 2-29
lhzu, 2-29
lhzux, 2-29
lhzx, 2-29
lmw, 2-31

Load operations 

I/O load accesses, C-6 
memory coherency actions, 3-18 
Load/store

address generation, 2-28, 2-33 
byte-reverse instructions, 2-30, A-22 
floating-point load instructions, 2-33, A-23 
floating-point move instructions, 2-27, A-24 
floating-point store instructions, 2-33, A-23 
integer load instructions, 2-28, A-21 
integer store instructions, 2-29, A-21 
load/store multiple instructions, 2-31, A-22 
memory synchronization instructions, 2-37, 
2-39, A-22 

string instructions, 2-32, A-22 
Load/store unit 

execution timing, 6-17 
latency, load and store instructions, 6-26 
Logical addresses 

translation into physical addresses, 5-1 
lswi, 2-32
lswx, 2-32
lwarx, 2-38, 3-19
lwarx/stwcx.

general information, 3-19
support, 8-42
lwbrx, 2-30
lwz, 2-29
lwzu, 2-29
lwzux, 2-29
lwzx, 2-29

M 

Machine check exception 
checkstop state, 4-22 
register settings, 4-21 
SRR1 bit settings, 4-11
machine check exception enabled, 4-21 
signal, 7-23 
mcrf, 2-36 
mcrfs, 2-27 
mcrxr, 2-36 
MEI protocol 

definition, MEI states, 3-15 
hardware considerations, 3-16 
Memory accesses, 8-4 
Memory coherency bit (M bit) 
cache interactions, 3-9 
I-bit setting, 3-11
M-bit setting, 3-11
timing considerations, 6-18 
Memory control instructions 

segment register manipulation, 2-43 
supervisor-level cache management, 2-43 
TLB management, 2-43 
user-level cache, 2-40 



Memory management unit 

address translation flow, 5-11 
address translation mechanisms, 5-8, 5-11 
block address translation, 5-8, 5-11, 5-20 
block diagram, 5-5-5-7 

direct address translation, 3-10, 5-9, 5-11, 5-20 

exceptions, 5-14 

features summary, 5-2 

instructions and registers, 5-17 

memory protection, 5-10 

overview, 1-11 

page address translation, 5-8, 5-11, 5-28 
page history status, 5-11, 5-21-5-25 
page table search operation, 5-30 
segment model, 5-21 

software table search operation, 5-33, 5-38, 5-40 
Memory synchronization 
eieio, 2-39 

instructions, 2-37, 2-39, A-22 

isync, 2-39 

lwarx, 2-38 

stwcx., 2-37, 2-38 

sync, 2-38 

Memory/cache access modes see also WIMG bits 
performance impact of copy-back mode, 6-18 
mfcr, 2-36 
mffs, 2-27 
mfmsr, 2-41 
mfspr, 2-41 
mfsr, 2-43 
mfsrin, 2-43 
mftb, 2-39 

Misaligned accesses, 2-12 
Misaligned data transfer, 8-17, 8-19 
Move instructions, 2-27 
MSR (machine state register) 
bit settings, 4-12 
DR/IR bit, 4-13 
EE bit, 4-12 
FE0/FE1 bits, 4-14 
POW bit, 2-5, 4-12 
RI bit, 4-15 

settings due to exception, 4-17 
TGPR bit, 2-5, 4-12 
mtcrf, 2-36 
mtfsb0, 2-27 
mtfsb1, 2-27 
mtfsf, 2-27 
mtfsfi, 2-27 
mtmsr, 2-41 
mtspr, 2-41 
mtsr, 2-43 
mtsrin, 2-43 
mulhw, 2-21 
mulhwu, 2-21 
mulli, 2-21 
mullw, 2-21 

N 

nand, 2-23 
neg, 2-21 

No-DRTRY mode, 8-40 
Nondenormalized mode, support, 2-24 
nor, 2-23 

O 

operand placement and performance, 2-14 
Operating environment architecture (OEA), xxvi, 
1-15, 2-41 

Optional instructions, A-38 

or, 2-23 

orc, 2-23 

ori, 2-22 

oris, 2-23 

P 

Page address translation 

page address translation flow, 5-28 
page size, 5-21 

selection of page address translation, 5-8, 5-14 
table search operation, 5-30 
TLB organization, 5-26 
Page history status 

R and C bit recording, 5-11, 5-21-5-25 
Page tables 

page table updates, 5-50 
resources for table search operations, 5-34 
software table search operation, 5-33, 5-38 
table search for PTE, 5-30 
Performance considerations, memory, 6-17 
Phase locked loop, 9-3 
Physical address generation 

memory management unit, 5-1 
Pipeline 

instruction timing, definition, 6-2 
Pipelined execution unit, 6-3 
PLL configuration, 7-30 
Power management 
doze mode, 9-3 

doze, nap, sleep, DPM bits, 2-7, 2-8 
full-power mode, 9-2 
nap mode, 9-3 

programmable power modes, 9-2 
sleep mode, 9-4 
software considerations, 9-4 
PowerPC 603-specific features, 1-4, C-1 



PowerPC architecture 

features used in the 603e, 1-16 
instruction list, A-1, A-9, A-17 
levels of implementation, 1-15 
operating environment architecture (OEA), xxvi, 
1-15, 2-41 

user instruction set architecture (UISA), xxv, 

1-15, 2-2 

virtual environment architecture (VEA), xxv, 
1-15, 2-38 
Privilege levels 

supervisor-level cache instruction, 2-43 
Privileged state see Supervisor mode 
Problem state see User mode 
Process switching, 4-16 

Processor control instructions, 2-36, 2-38, 2-41, A-25 
Program exception, 4-29 
Program order, 6-2 
Programmable power states 
doze mode, 9-3 
full-power mode 

DPM enabled/disabled, 9-2 
nap mode, 9-3 
sleep mode, 9-4 
Protection of memory areas 

direct-store interface protection (603-specific), 
C-18 

no-execute protection, 5-12 
options available, 5-10 
protection violations, 5-14 
PTEGs (PTE groups) 

table search operation, 5-30 
PTEs (page table entries), 5-30 

Q 

QACK signal, 7-25, 8-38, 8-41 
QREQ signal, 7-25, 8-42 
Qualified bus grant, 8-7 
Qualified data bus grant, 8-23 

R 

Read atomic operation, 3-19 
Read operation, 3-19 

Read with intent to modify operation, 3-19 
Real address (RA) see Physical address generation 
Real addressing mode see Direct address translation 
(translation disabled) 

Reduced-pinout mode, 8-40 
Referenced (R) bit maintenance 
recording, 5-11, 5-21-5-24, 5-31 
Registers 

configuration registers 
MSR, 2-5 
PVR, 2-5 

exception handling registers 
DAR, 2-6 
DSISR, 2-6 
SPRG0-SPRG3, 2-6 
SRR0, 2-6 
SRR1, 2-6 

implementation-specific registers, 2-7 
memory management registers 
BAT registers, 2-5 
SDR1, 2-6 
SR, 2-6 

supervisor-level 

BAT registers, 2-5 
DAR, 2-6 

DCMP and ICMP, 2-9, 5-37 
DEC, 2-6 

DMISS and IMISS, 2-9, 5-36 
DSISR, 2-6 
EAR, 2-6 

HASH1 and HASH2, 2-10, 5-37 
HID0 and HID1, 2-7 
IABR, 2-11 
MSR, 2-5 
PVR, 2-5 
RPA, 2-10 
SDR1, 2-6 
SPRG0-SPRG3, 2-6 
SR, 2-6 
SRR0, 2-6 
SRR1, 2-6 
time base (TB), 2-6 
user-level 
CR, 2-4 
CTR, 2-4 
FPR0-FPR31, 2-4 
FPSCR, 2-4 
GPR0-GPR31, 2-4 
LR, 2-4 

time base (TB), 2-4 
XER, 2-4 
Rename buffer, 6-2 
Rename register operation, 6-11 
Reservation station, 6-2 
Reserved instruction class, 2-17 
Reset 

hard reset, 4-19 
HRESET signal, 7-24, 8-41 
reset exception, 4-18 
SRESET signal, 7-25, 8-41 
rfi, 2-41, 4-16 
rlwimi, 2-23 
rlwinm, 2-23 
rlwnm, 2-23 

Rotate and shift instructions, 2-23, A-18-A-19 
RPA (required physical address) 
bit settings, 2-11, 5-38 
RSRV signal, 7-26, 8-42 

S 

sc, 2-41 

Segment registers 

SR manipulation instructions, 2-43, A-25 
T bit, C-2 

Segmented memory model, see Memory management 
unit 

Self-modifying code, 2-28 
Serializing instructions, 6-11 
Signals 

A0-A31, 7-6 

AACK, 7-14 

ABB, 7-5, 8-7 

address arbitration, 7-4, 8-7 

address transfer, 8-12 

address transfer attribute, 8-13 

AP0-AP3, 7-7 

APE, 7-8 

ARTRY, 7-15, 8-25 
BG, 7-4, 8-7 
BR, 7-4, 8-7 

Cache set entry signal, 8-30 
checkstop, 8-41 
CI, 7-13 

CKSTP_IN, 7-23 
CKSTP_OUT, 7-24 
CLK_OUT, 7-29 
configuration, 7-3 
COP/scan interface, 7-27 
CSE0-CSE1, 7-14 
data arbitration, 8-8, 8-22 
data transfer termination, 8-25 
DBB, 7-17, 8-8, 8-23 
DBDIS, 7-20 
DBG, 7-16, 8-8 
DBWO, 7-17, 8-8, 8-24, 8-43 
DH0-DH31/DL0-DL31, 7-18 
DP0-DP7, 7-19 
DPE, 7-20 

DRTRY, 7-21, 8-25, 8-28 
GBL, 7-13 
HRESET, 7-24 
INT, 7-22, 8-41 
MCP, 7-23 

PLL_CFG0-PLL_CFG3, 7-29 
QACK, 7-25, 8-38, 8-41 
QREQ, 7-25, 8-42 
reset, 8-41 
RSRV, 7-26, 8-42 
SMI, 4-37, 7-23 
SRESET, 7-25, 8-41 
system quiesce control, 8-42 
TA, 7-20 
TBEN, 7-26 
TBST, 7-12, 8-25 
TC0-TC1, 7-13, 8-20 
TEA, 7-22, 8-25, 8-29 
TLBISYNC, 7-26 
TS, 7-6 

TSIZ0-TSIZ2, 7-11, 8-13 
TT0-TT4, 7-8, 8-13 
WT, 7-13 

XATS (603-specific), 1-4, C-2, C-3 
Single-beat reads with data delays, timing, 8-35 
Single-beat transactions, 3-8 
Single-beat transfer 

reads with data delays, timing, 8-34 
reads, timing, 8-32 
termination, 8-26 
writes, timing, 8-33 
slw, 2-24 

SMI signal, 4-37, 7-23 
Snoop operation, 3-19, 6-18 
Split-bus transaction, 8-8 
SPR encodings 

not implemented in 603e, B-3 
sraw, 2-24 
srawi, 2-24 
SRESET signal, 7-25 
SRR0/SRR1 (status save/restore registers) 

bit settings for machine check exception, 4-11 
bit settings for table search operations, 4-11 
srw, 2-24 
Stall, 6-2 

Static branch prediction, 6-14 

stb, 2-30 

stbu, 2-30 

stbux, 2-30 

stbx, 2-30 

stfd, 2-34 

stfdu, 2-34 

stfdux, 2-34 

stfdx, 2-34 

stfiwx, 2-34 

stfs, 2-34 

stfsu, 2-34 

stfsux, 2-34 

stfsx, 2-34 

sth, 2-30 

sthbrx, 2-30 

sthu, 2-30 



sthux, 2-30 
sthx, 2-30 
stmw, 2-31 
Store operations 

I/O operations to BUC, C-5 
memory coherency actions, 3-18 
single-beat writes, 8-33 
String instructions, 2-32, A-22 
stswi, 2-32 
stswx, 2-32 
stw, 2-30 
stwbrx, 2-30 
stwcx., 2-37, 2-38, 4-16 
stwu, 2-30 
stwux, 2-30 
stwx, 2-30 
subf, 2-21 
subfc, 2-21 
subfe, 2-21 
subfic, 2-21 
subfme, 2-21 
subfze, 2-21 
Superscalar, 6-2 

Supervisor mode see Privilege levels 
Supervisor-level registers 
list of, 2-5 

sync, 2-38, 3-19, 4-16 
Synchronization 

context/execution synchronization, 2-19 
execution of rfi, 4-16 

memory synchronization instructions, 2-37, 
2-39, A-22 
SYSCLK signal, 7-28 
System call exception, 4-31 
System linkage instructions, 2-41, A-24 
System management interrupt, 4-37, 4-37, 9-1 
System quiesce control signals 
QACK and QREQ, 8-42 
System register unit 

execution timing, 6-17 
latency, CR logical instructions, 6-22 
latency, system register instructions, 6-22, C-20 
System status 

CKSTP_IN, 7-23 
CKSTP_OUT, 7-24 
HRESET, 7-24 
INT, 7-22 
MCP, 7-23 
QACK, 7-25 
QREQ, 7-25 
RSRV, 7-26 
SMI, 7-23 
SRESET, 7-25 
TBEN, 7-26 
TLBISYNC, 7-26 






T 

TA signal, 7-20 
Table search operations 
algorithm, 5-30 

software routines for the 603e, 5-33, 5-38-5-49 
SRR1 bit settings, 4-11 

table search flow (primary and secondary), 5-31 
TBEN signal, 7-26 
TBST signal, 7-12, 8-13, 8-25 
TC0-TC1 signals, 7-13, 8-20 
TEA signal, 7-22, 8-29 
Termination, 8-20, 8-25 
Throughput, 6-2 
Timing diagrams, interface 

address transfer signals, 8-12 
burst transfers with data delays, 8-36 
direct-store interface load access, C-11 
direct-store interface store access, C-12 
single-beat reads, 8-32 
single-beat reads with data delays, 8-34 
single-beat writes, 8-33 
single-beat writes with data delays, 8-35 
use of TEA, 8-37 
using DBWO, 8-43 
Timing, instruction 

BPU execution timing, 6-12 

branch timing example, 6-16 

cache arbitration, 6-7 

cache hit, 6-7 

cache miss, 6-8 

FPU execution timing, 6-16 

instruction dispatch, 6-9 

instruction fetch timing, 6-7 

instruction flow, 6-5 

instruction scheduling guidelines, 6-19 

IU execution timing, 6-16 

latency summary, 6-21 

load/store unit execution timing, 6-17 

overview, 6-3 

SRU execution timing, 6-17 
stage, definition, 6-2 
TLB 

description, 5-25 

invalidate (tlbie instruction), 5-27, 5-50 
TLB invalidate 

TLB management instructions, A-26 
tlbie, 2-44 

TLBISYNC signal, 7-26 

tlbld, 2-44, 2-46 

tlbli, 2-44, 2-47 

tlbsync, 2-44 

Trace exception, 4-32 

Transactions, data cache, 3-8 

Transfer, 8-11, 8-24 



Trap instructions, 2-36 
TS signal, 7-6, 8-12 
TSIZ0-TSIZ2 signals, 7-11, 8-13 
TT0-TT4 signals, 7-8, 8-13 
tw, 2-36 
twi, 2-36 

U 

Use of TEA, timing, 8-37 
User mode, 2-40 

User instruction set architecture (UISA), xxv, 
1-15, 2-2 

User-level registers 
list of, 2-4, 2-5 
Using DBWO, timing, 8-43 

V 

Virtual environment architecture (VEA), xxv, 
1-15, 2-38 

W 

WIMG bits, 3-9, 8-30 
Write-with-atomic operation, 3-19 
Write-with-flush operation, 3-19 
Write-with-kill operation, 3-19 
Write-back, 6-2 
Write-back mode, 3-11 
Write-through mode (W bit) 
cache interactions, 3-9 
timing considerations, 6-18 

W-bit setting, 3-10 

WT signal, 7-13 

X 

XATS (603-specific), 1-4, C-2, C-3 
xor, 2-23 
xori, 2-23 
xoris, 2-23 
IOWA, Cedar Rapids .... (319)378-0383 
KANSAS, Kansas City/ 

Mission (913)451-8555 

MARYLAND, Columbia .(410)381-1570 
MASSACHUSETTS, 

Marlborough (508)481-8100 

MASSACHUSETTS, 

Woburn (617)932-9700 

MICHIGAN, Detroit (810)347-6800 

Literature . . (800)392-2016 
MINNESOTA, 

Minnetonka (612)932-1500 

MISSOURI, St. Louis . . . (314)275-7380 
NEW JERSEY, Fairfield . . (201)808-2400 
NEW YORK, Fairport . . . (716)425-4000 
NEW YORK, Hauppauge . (516)361-7000 
NEW YORK, Fishkill .... (914)896-0511 
NORTH CAROLINA, 

Raleigh (919)870-4355 

OHIO, Cleveland (216)349-3100 



OHIO, Columbus/ 

Worthington (614)431-8492 

OHIO, Dayton (513)438-6800 

OKLAHOMA, Tulsa .... (918)459-4565 
OREGON, Portland (503)641-3681 

PENNSYLVANIA, 

Colmar (215)997-1020 

Philadelphia/Horsham . (215)957-4100 
TENNESSEE, Knoxville . . (615)584-4841 

TEXAS, Austin (512)502-2100 

TEXAS, Houston (713)783-6400 

TEXAS, Plano (214)516-5100 

VIRGINIA, Richmond . . . (804)285-2100 

UTAH, CSI Inc. (801)572-4010 

WASHINGTON, 

Bellevue (206)454-4160 

Seattle Access (206)622-9960 

WISCONSIN, Milwaukee/ 

Brookfield (414)792-0122 

Field Applications Engineering Available 
Through All Sales Offices 

CANADA 
BRITISH COLUMBIA, 

Vancouver (604)293-7650 

ONTARIO, Toronto (416)497-8181 

ONTARIO, Ottawa (613)226-3491 

QUEBEC, Montreal (514)333-3300 

INTERNATIONAL 

AUSTRALIA, 

Melbourne (61-3)98870711 

AUSTRALIA, Sydney . . . (61-2)9661071 
BRAZIL, Sao Paulo .... 55(11)815-4200 

CHINA, Beijing 86-505-2180 

FINLAND, Helsinki ... 358-0-351 61191 

earphone 358(49)211501 

FRANCE, Paris 33134 635900 

GERMANY, Langenhagen/ 

Hannover 49(511)786880 

GERMANY, Munich 49 89 92103-0 

GERMANY, 

Nuremberg 49 911 96-3190 

GERMANY, 

Sindelfingen 49 7031 79 710 

GERMANY, Wiesbaden . . 49 611 973050 

HONG KONG, 

Kwai Fong 852-6106888 

Tai Po 852-6668333 

INDIA, Bangalore .... 91-80-5594754 
ISRAEL, Herzlia 972-9-590222 



ITALY, Milan 39(2)82201 

JAPAN, Fukuoka 81-92-725-7583 

JAPAN, Gotanda 81-3-5487-8311 

JAPAN, Nagoya 81-52-232-3500 

JAPAN, Osaka 81-6-305-1802 

JAPAN, Sendai 81-22-268-4333 

JAPAN, Takamatsu . . . 81-878-37-9972 

JAPAN, Tokyo 81-3-3440-3311 

KOREA, Pusan 82(51)4635-035 

KOREA, Seoul 82(2)554-5118 

MALAYSIA, Penang 60(4)374514 

MEXICO, Mexico City . . . 52(5)282-0230 
MEXICO, Guadalajara . . 52(36)21-8977 

Marketing 52(36)21-2023 

Customer Service . . . 52(36)669-9160 
NETHERLANDS, 

Best (31)4998 612 11 

PUERTO RICO, 

San Juan (809)793-2170 

SINGAPORE (65)4818188 

SPAIN, Madrid 34(1)457-8204 

or 34(1)457-8254 

SWEDEN, Solna 46(8)734-8800 

SWITZERLAND, 

Geneva 41(22)79911 11 

SWITZERLAND, Zurich . . 41(1)730-4074 

TAIWAN, Taipei 886(2)717-7089 

THAILAND, Bangkok . . . 66(2)254-4910 
UNITED KINGDOM, 

Aylesbury 44 1 (296)395-252 

FULL LINE REPRESENTATIVES 
CALIFORNIA, Loomis 
Galena Technology 

Group (916)652-0268 

NEVADA, Reno 

Galena Tech. Group . . . (702)746-0642 
NEW MEXICO, Albuquerque 
S&S Technologies, Inc. . (602)414-1100 
UTAH, Salt Lake City 
Utah Comp. Sales, Inc. . (801)561-5099 
WASHINGTON, Spokane 
Doug Kenley (509)924-2322 

HYBRID/MCM COMPONENT 
SUPPLIERS 

Chip Supply (407)298-7100 

Elmo Semiconductor (818)768-7400 

Minco Technology 

Labs Inc (512)834-2022 

Semi Dice Inc (310)594-4631 
International Motorola Distributor and Sales Offices 



AUTHORIZED DISTRIBUTORS 



AUSTRALIA 

Veltek Australia Pty Ltd (61)3 808-7511 

VSI Electronics (Australia) . . . (61)2 878-1299 

AUSTRIA 

EBV Austria (43) 222 894 1774 

Elbatex GmbH (43)222 86 3211 

Spoerle Austria (43) 222 31872700 

BELGIUM 

Diode Spoerle (32) 2 725 4660 

EBV Belgium (32)2 716 0010 

CHINA 

Advanced Electronics Ltd. . . . (852)2305-3633 
China El. App. Corp. 

Xiamen Co (86)592 513-2489 

Nanco Electronics 

Supply Ltd (852) 2 333-5121 

Qing Cheng 

Enterprises Ltd (852) 2 493-4202 

WKK-China (852)2 357-8888 

DENMARK 

Avnet Nortec A/s (45)44 880800 

EBV Denmark (45)39690511 

FINLAND 

Arrow Field OY (35) 807 775 71 

Avnet Nortec OY (35)806 13181 

FRANCE 

Arrow Electronique (33) 1 49 78 49 78 

Avnet Components (33) 1 49 65 25 00 

EBV France (33) 1 64 68 86 00 

Future Electronics (33)1 69821111 

Newark (33)1-30954060 

SEI/Scaib (33) 1 69 19 89 00 

GERMANY 

Avnet E2000 (49) 89 4511001 

EBV Germany (49)89 456100 

Future Electronics GmbH . . . (49) 89-957 270 

Jermyn GmbH (49)6431-5080 

Newark (49)2154-70011 

Sasco GmbH (49)89-46110 

Spoerle Electronic (49)6103-304-0 

HOLLAND 

EBV Holland (31) 3465 623 53 

Diode Spoerle BV (31) 340 29 1234 

HONG KONG 

Nanshing CIr. & 

Chem. Co. Ltd (852)2 333-5121 

Wong’s Kong King 

Semi. Ltd (852)2 357-8888 

INDIA 

Canyon Products Ltd (91) 755-2583 



INDONESIA 

P.T. Ometraco (62)2230-7032 

ITALY 

Avnet Adelsy SpA (39)2 38103100 

EBV Italy (39) 2 660961 

Silverstar SpA (39) 2 66 12 51 

JAPAN 

AMSC Co., Ltd 81-422-54-6800 

Marubun Corporation 81-3-3639-8951 

OMRON Corporation 81-3-3779-9053 

Fuji Electronics Co., Ltd 81-3-3814-1411 

Tokyo Electron Ltd 81-3-5561-7254 

Nippon Motorola Micro Elec. . 81-3-3280-7300 

KOREA 

Lite-On Korea Ltd (82)2858-3853 

Nasco Co. Ltd (82)23772-6800 

Jung Kwang Sa (82)2278-5333 

NEW ZEALAND 

VSI Electronics (NZ) Ltd .... (64)9 579-6603 

NORWAY 

Avnet Nortec A/S Norway . . . (47) 66 846210 
Arrow Tahonic A/S (47)2237 8440 

PHILIPPINES 

Alexan Commercial . . (63)2241-9493 or 9491 

SINGAPORE 

GEIC (65)298-7633 

Draco Impex Asia Pte Ltd (65) 545-7811 

Strong Pte. Ltd (65) 276-3996 

SPAIN 

Amitron Arrow (34) 1 304 30 40 

EBV Spain (34) 1 358 86 08 

Selco S.A (34) 1 637 1011 

SWEDEN 

Avnet Nortec AB (48) 8 629 1 4 00 

Arrow-Th;s (48)8 362970 

SWITZERLAND 

EBV Switzerland (41) 1 7456161 

Elbatex AG (41) 56 275 165 

Spoerle (41) 1 8746262 

THAILAND 

Shapiphat Ltd (66)2222-9937 

or 2224-6767 

TAIWAN 

Mercuries & Assoc. Ltd (886)2 503-1111 

Solomon Technology Corp. . (886)2788-8989 
Strong Electronics Co. Ltd. . (886)2 917-9917 

UNITED KINGDOM 

Arrow Electronics (UK) Ltd . (44) 1 234 270027 

Avnet/Access (44) 1 462 480888 

Future Electronics Ltd (44) 1 753 687000 



Macro Marketing Ltd (44) 1 628 60600 

Newark (44) 1 420 543333 

CANADA 

All Provinces - Newark (800)463-9275 

ALBERTA 

Calgary 

Electro Sonic Inc (403)255-9550 

Future Electronics (403)250-5550 

Hamilton/Hallmark (800)663-5500 

Edmonton 

Future Electronics (403)438-2858 

Hamilton/Hallmark (800)663-5500 

Saskatchewan 

Hamilton/Hallmark (800)663-5500 

BRITISH COLUMBIA 

Vancouver 

Arrow Electronics (604)421-2333 

Electro Sonic Inc (604)273-2911 

Future Electronics (604)294-1166 

Hamilton/Avnet Electronics . (604)420-4101 

MANITOBA 

Winnipeg 

Electro Sonic Inc (209)783-3105 

Future Electronics (204)944-1446 

Hamilton/Hallmark (800)663-5500 

ONTARIO 

Ottawa 

Arrow Electronics (613)226-6903 

Electro Sonic Inc (613)728-8333 

Future Electronics (613)820-8313 

Hamilton/Hallmark (613)226-1700 

Toronto 

Arrow Electronics (416)670-7769 

Electro Sonic Inc (416)494-1666 

Future Electronics (905)612-9200 

Hamilton/Hallmark (905)564^060 

Newark (519)685-4280 

(905)670-2888 

Richardson Electronics (905)795-6300 

FAI (905)612-9888 

QUEBEC 

Montreal 

Arrow Electronics (514)421-7411 

Future Electronics (514)694-7710 

Hamilton/Hallmark (514)335-1000 

Richardson (514)748-1770 

Quebec City 

Arrow Electronics (418)687-4231 

Future Electronics (418)682-8092 

St. Laurent 

Richardson Electronics (514)748-1770 
IBM 

MICROELECTRONICS 
PowerPC Marketing 

Mail stop A25/862-1 

1000 River Street 

Essex Junction, VT 05452-4299 

Tel: (800) PowerPC 

Tel: (800) 769-3772 

Fax: (800) POWERfax 

Fax: (800)769-3732 



IBM 

MICROELECTRONICS 

MANUFACTURERS 

REPRESENTATIVES 

Bonser-Philhower 

Sales 

689 West Renner Road 
Suite 101 

Richardson, TX 75080 
Tel: (214) 234-8438 
Fax: (214)437-0897 

8240 MoPac Expressway 
Suite 135 
Austin, TX 78759 
Tel: (512) 346-9186 
Fax: (512) 346-2393 

10700 Richmond 
Suite 150 

Houston, TX 77042 
Tel: (713) 782-4144 
Fax: (713) 789-3072 



Centaur Corporation 

18006 Sky Park Circle 
Suite 106 
Irvine, CA 92714 
Tel: (714)261-2123 
Fax: (714) 261-2905 

3914 Murphy Canyon Road 
#A125 

San Diego, CA 92123 
Tel: (619)278-4950 
Fax: (619)278-0649 

23901 Calabasas Road 
Suite 1063 

Calabasas, CA 91302 
Tel: (818)591-1655 
Fax: (818) 591-7479 

Mill-Bern Associates 

2 Mack Road 
Woburn, MA 01801 
Tel: (617) 932-3311 
Fax: (617) 932-0511 

Nexus 

555 N. Mathilda Avenue 
Suite 120 

Sunnyvale, CA 94086 
Tel: (408) 720-4787 
Fax: (408) 720-4453 



S-J Associates 

265 Sunrise Highway 
Rockville Centre, NY 11570 
Tel: (516) 536-4242 
Fax: (516)536-9638 

3547 West Lake Road 
Canandaigua, NY 14424 
Tel: (716) 394-3281 
Fax: (716)394-1139 

131-D Gaither Drive 
Mt. Laurel, NJ 08054 
Tel: (609) 866-1234 
Fax:(609)866-8627 

10 Cooper Ridge Circle 
Guilford, CN 06437 
Tel: (203)458-7558 
Fax: (203) 458-1181 

900 S. Washington Street 
Suite B-2 

Falls Church, VA 22046 
Tel: (703)533-2233 
Fax: (703)533-2236 
IBM 

MICROELECTRONICS 

DISTRIBUTORS 

Bell Industries, Inc. 
Electronic Distribution 
Group 

11812 San Vicente Blvd 
Suite 300 

Los Angeles, CA 90049 
Fax:447-3265 
Tel: (800) BUY-BELL 

*FAE in these locations. 

Arizona 

* Phoenix (SW) 

140 S. Lindon Lane #102 
Tempe, AZ 85281 
Tel: (602) 966-3600 
Fax:(602)967-6584 

California 

* Ventura (SW) 

30101 Agoura Ct., #118 
Agoura Hills, CA 91301 
Tel: (818) 879-9492 
Fax: (818) 991-7695 

* Orange County (SW) 

220 Technology Drive, 

Suite 100 

Irvine, CA 92718 
Tel: (714) 727-4500 
Fax:(714)453-4610 

* Sacramento (NW) 

4311 Anthony Ct., #100 
Rocklin, CA 95677 
Tel: (916) 652-0418 
Fax: (916) 652-0403 



San Diego (SW) 

5520 Ruffin Rd., 

Suite 209 

San Diego, CA 92123 
Tel: (619) 268-1277 
Fax:(619)492-9826 

* Sunnyvale (NW) 

1161 N. Fairoaks Ave. 
Sunnyvale, CA 94089 
Tel: (408) 734-8570 
Fax:(408)734-8875 

Colorado 

* Denver (NW) 

1873 S. Bellaire St., #100 
Denver, CO 80222-0000 
Tel: (303) 691-2460 
Fax:(303)691-9036 

Connecticut 

Hartford (NW) 

1064 East Main Street 
Meriden, CT 06450 
Tel: (203) 639-6000 
Fax: (203) 639-6005 

Florida 

* Orlando (Southern) 

650 S. North Lake Blvd, #400 
Altamonte Springs, FL 32701 
Tel: (407) 339-0078 
Fax:(407)339-0139 

Georgia 

Atlanta 

3000 Business Park Dr., #D 
Norcross, GA 30071 
Tel: (404)446-7167 
Fax: (404) 446-7264 



Illinois 

* Chicago (Central) 

870 Cambridge Drive 

Elk Grove Village, IL 60007 
Tel: (708) 640-1910 
Fax: (708) 640-1926 

Indiana 

* Fort Wayne (Central) 
3433 E. Washington Blvd. 
Ft. Wayne, IN 46803 

Tel: (219) 422-4300 
Fax: (219)423-3420 

* Indianapolis (Central) 
5230 West 79th Street 
P.O. Box 6885 
Indianapolis, IN 46268 
Tel: (317) 875-8200 
Fax: (317) 875-8219 

Maryland 

Baltimore (Mid-Atlantic) 
8945 Guildford Rd., 

Suite 130 

Columbia, MD 21046 
Tel: (410) 290-5100 
Fax: (410) 290-8006 

Massachusetts 

* Boston (NE) 

100 Burtt Road, #106 
Andover, MA 01810 
Tel: (508) 474-8880 
Fax:(508)474-8902 

New Jersey 

Fairfield (Mid-Atlantic) 

271 Route 46 West 
Suites F202-203 
Fairfield, NJ 07004 
Tel: (201) 227-6060 
Fax: (201) 227-2626 



New Mexico 

Albuquerque (SW) 

11728 Linn, N.E. 
Albuquerque, NM 87123 
Tel: (505) 292-2700 
Fax:(505)275-2819 

Ohio 

Cleveland (Central) 

31200 Solon Road, #11 
Solon, Ohio 44139 
Tel: (216) 498-2002 
Fax: (216) 498-2006 

* Dayton Industrial (Central) 
444 Windsor Park Drive 
Dayton, OH 45459 

Tel: (513) 435-5922 
Fax: (513)435-3122 

Dayton (Military) 

446 Windsor Park Drive 
Dayton, OH 45459 
Tel: (513) 434-8231 
Fax: (513) 434-8103 

Oregon 

Portland (NW) 

9275 S.W. Nimbus 
Beaverton, OR 97005 
Tel: (503) 644-3444 
Fax: (503)520-1948 

Pennsylvania 

* Philadelphia (Mid-Atlantic) 
2556 Metropolitan Drive 
Trevose, PA 19053 

Tel: (215) 953-2899 
Fax: (215)364-4927 

Texas 

Dallas (Southern) 

1701 Greenville Ave #306 
Richardson, TX 75081 
Tel: (214) 690-0466 
Fax: (214) 690-0467 
Utah 

Salt Lake City (Northwest) 
6912 S. 185 West, Suite B 
Midvale, UT 84047 
Tel: (801)561-9691 
Fax: (801) 255-2477 

Washington 

Seattle (NW) 

1715 114th Ave. S.E.,#208 
Bellevue, WA 98004 
Tel: (206)646-8750 
Fax: (206) 646-8559 

Wisconsin 

Milwaukee (Central) 

W 226 N 900 Eastmound 
Waukesha, WI 53186 
Tel: (414) 547-8879 
Fax: (414) 547-6547 



Bell Microproducts 
Branch Locations 

California 

Northern 

1941 Ringwood Ave. 

San Jose, CA 95131 
Tel: (408)451-9400 
Fax: (408) 451-1600 

Southern 

18350 Mt. Langley, #207 
Fountain Valley, CA 92708 
Tel: (714)963-0667 
Fax: (714) 968-3195 

860-H Hampshire 
Westlake Village, CA 91362 
Tel: (805) 496-2606 
Fax: (805) 496-6119 



Massachusetts 
16, Upton Drive 
Wilmington, MA 01887 
Tel: (508) 658-0222 
Fax: (508) 694-9987 

Minnesota 
Minneapolis 
13513 McGinty Road E. 
Minnetonka, MN 55305 
Tel: (612) 933-3236 
Fax: (612) 933-3415 

New York/New Jersey 
1055 Parsippany Blvd., Ste 501 
Parsippany, NJ 07054 
Tel: (201) 402-5959 
Fax: (201)402-0424 

Texas 

100 North Central Expressway, 
Ste 502 

Richardson, TX 75080-5300 
Tel: (214) 783-4191 
Fax: (214)234-2123 

Washington 

18210 Redmond Way, Ste 302 
Redmond, WA 98052 
Tel: (206) 861-5710 
Fax: (206) 885-5399 



Marshall Industries 
Branch Locations 

Alabama 

Huntsville 

3313 Memorial Parkway South 
Huntsville, AL 35801 
Tel: (205)881-9235 



Arizona 
Phoenix 
9831 S. 51st St. 

Suite C107-109 
Phoenix, AZ 85044 
Tel: (602)496-0290 

California 

Irvine 

One Morgan 
Irvine, CA 92718 
Tel: (714) 458-5301 

Los Angeles 
26637 West Agoura Rd 
Calabasas, CA 91302 
Tel: (818) 878-7000 

Sacramento 
3039 Kilgore Avenue 
Rancho Cordova, CA 95670 
Tel: (916) 635-9700 

San Diego 
5961 Kearny Villa 
San Diego, CA 92123 
Tel: (619) 627-4140 

San Francisco 
336 Los Coches Street 
Milpitas, CA 95035 
Tel: (408) 942-4600 

Colorado 

Denver 

12351 North Grant 
Thornton, CO 80241 
Tel: (303) 451-8383 



Connecticut 
Connecticut 
20 Sterling Drive 
Barnes Industrial Park, North 
Post Office Box 200 
Wallingford, CT 06492-0200 
Tel: (203) 265-3822 

Florida 

Fort Lauderdale 
2700 W. Cypress Creek Rd., 
Suite Dll 4 

Ft. Lauderdale, FL 33309 
Tel: (305) 977-4880 

Orlando 

380 S. Northlake Boulevard, 
Suite 1024 

Altamonte Springs, FL 32701 
Tel: (407) 767-8585 

Tampa 

2840 Scherer Drive, Suite 410 
St. Petersburg, FL 33716 
Tel: (813) 573-1399 

Georgia 

Atlanta 

5300 Oakbrook Parkway, 

Suite 140 

Norcross, GA 30093-9990 
Tel: (404)923-5750 

Illinois 

Chicago 

50 East Commerce Drive, 

Unit 1 

Schaumburg, IL 60173 
Tel: (708) 490-0155 
Indiana 
Indianapolis 
6990 Corporate Drive 
Indianapolis, IN 46278 
Tel: (317) 297-0483 

Kansas 
Kansas City 

10413 West 84th Terrace 
Pine Ridge Business Park 
Lenexa, KS 66214 
Tel: (913) 492-3121 

Massachusetts 

Boston 

33 Upton Drive 
Wilmington, MA 01887 
Tel: (508) 668-0810 

Maryland 

Maryland 

2221 Broadbirch Drive 
Silver Springs, MD 20904 
Tel: (301) 622-1118 

Michigan 

Michigan 

31067 Schoolcraft 
Livonia, MI 48150 
Tel: (313) 525-5860 

Minnesota 
Minneapolis 
14800 28th Avenue 
Suite 175 

Plymouth, MN 65447 
Tel: (612) 559-2211 



Missouri 
St. Louis 

3377 Hollenberg Drive 
Bridgeton, MO 63044 
Tel: (314) 291-4650 

New Jersey 
North New Jersey 
101 Fairfield Road 
Fairfield, NJ 07006 
Tel: (201) 882-0320 

New York 
Binghamton 
100 Marshall Drive 
Endicott, NY 13760 
Tel: (607) 785-2346 

Long Island 
95 Oser Avenue 
Hauppauge, NY 11788 
Tel: (516) 273-2695 

Rochester 

1250 Scottsville Road 
Rochester, NY 14624 
Tel: (716) 236-7620 

North Carolina 
Raleigh 

5224 Greens Dairy Road 
Raleigh, NC 27604 
Tel: (919) 878-9882 

Ohio 

Cleveland Branch 
30700 Bainbridge Road, Unit A 
Solon, OH 44139 
Tel: (216) 248-1788 



Dayton 

3620 Park Center Drive 
Dayton, OH 45414 
Tel: (513) 898-4480 

Oregon 

Portland 

9705 S.W. Gemini Drive 
Beaverton, OR 97005 
Tel: (503)644-5050 

Pennsylvania 
Philadelphia 
158 Gaither Drive 
Mt. Laurel, NJ 08054 
Tel: (609)234-9100 

Texas 

Austin 

8504 Cross Park Drive 
Austin, TX 78754 
Tel: (512) 837-1991 

Dallas 

Corporate Square Tech 
Center III 

1651 North Glenville Drive 
Richardson, TX 75081 
Tel: (214) 706-0600 

Houston 

10681 Haddington Drive, 
Suite 160 

Houston, TX 77043 
Tel: (713) 467-1666 

Utah 

Salt Lake City 
2355 South 1070 W 
Suite D 

Salt Lake City, Utah 84119 
Tel: (801) 973-2288 



Washington 

Seattle 

11715 North Creek Pkwy South, 
Suite 112 
Bothell, WA 98011 
Tel: (206) 486-5747 

Wisconsin 

Milwaukee 

Crossroads Corporate Center 1 
20900 Swenson Drive, Suite 
150 

Waukesha, WI 53186 
Tel: (414) 797-8400 



Canada 

G.S.Marshall Company 

Toronto 
4 Paget Road 
Units 10 and 11 
Building 1112 
Brampton, Ontario 
L6T6G3 

Tel: (416) 458-8046 
Montreal 

148 Brunswick Boulevard 
Pointe Claire, Quebec H9R 5P9 
Tel: (514) 694-8142 



United Kingdom 
Macro Marketing, LTD. 

Burnham Lane 
Slough SL1 6LN 
United Kingdom 
Tel: (44) 628 604 383 
Fax: (44) 628666873 
/668071 



IBM Sales Offices 








Blue Microelectronics 
Limited 

Albion House, 

Victoria Promenade 
Northhampton, NN1, 1HH 
United Kingdom 
Tel: (44) 604 603310 
Fax: (44) 604 603 320 


Leading 

Technologies SA 

1 Avenue des Neuvilles 
1920 Martigny 
Switzerland 
Tel: (41)26-232 257 
Fax: (41)26-228 609 


Germany 

ITT Distribution 
Postfach 1246 
Bahnhofstrasse 44 
D-71693 Moglingen, Germany 
Tel: 0130 85 7314 
Fax: 0130 8333 01 


Aviv Electronics 

4, Hayetzira St. 
Radnana, 43100 
Israel 

Tel: (972)9-983232 
Fax: (972) 9-916510 


France 

A2M 

5, rue Carle Vernet 
92315 SEVRES CEDEX 
France 

Tel: (33)-1-46-23-79-00 
Fax: (33)-1-46-23-79-23 









INTERNATIONAL 
SALES OFFICES 



IBM Microelectronics 

Department 1045 

224 Boulevard J.F.Kennedy 

91105 Corbeil-Essonnes, 

CEDEX 

France 

Tel: (33) 1-60885167 
Fax: (33) 1-60 884920 



IBM Microelectronics 

Europe 

Department 8142 
Tour Descartes, 

CEDEX 50 
F.92066 Paris 
La Defense 
France 

Tel: (33)1-49 05 8533 
Fax: (33)1-47887912 



IBM Microelectronics 

Department R0260 
800 Ichimiyake, 

Yasu-cho, Yasu-gun, Shiga-ken 
Japan 520-23 
Tel: (81)775-87-4745 
Fax: (81) 775-87-4735 